[SOLVED] The problem of setting up data set meta data information

winecoding New Altair Community Member
edited November 2024 in Community Q&A
I am trying to load a large CSV file (about 18 GB) into RapidMiner to build a classification model. The “import configuration wizard” seems to have difficulty loading the data, so I chose to use “Edit parameter list: data set meta data information” to set up the attribute and label information instead. However, the UI only lets me enter this information column by column, and my CSV file has about 80,000 columns. How should I handle this kind of scenario? Thanks.

Answers

  • MariusHelf
    MariusHelf New Altair Community Member
    It may be a good idea to import the data into a database. MySQL, for example, offers command-line options to create a table directly from a CSV file. Once the data is in a database, it will also be easier to process with RapidMiner. Even on a powerful machine with sufficient RAM, performing any kind of operation on 18 GB of data will require a lot of patience. Usually you want to work on only a subset (sample) of the complete data, and the database makes it easy to access those parts.
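
To illustrate the idea of working on a sample rather than the full 18 GB, here is a minimal Python sketch. It is not the database route described above: it swaps in plain file streaming with reservoir sampling, which may be worth considering because MySQL limits a table to 4096 columns, well below the roughly 80,000 in the question. The function name `sample_csv_rows`, the file paths, and the assumption that the first line of the CSV is a header row are all hypothetical and not taken from the original post.

```python
import csv
import random


def sample_csv_rows(src_path, dst_path, k, seed=0):
    """Copy the header and k uniformly sampled data rows from src_path to dst_path.

    Assumes the first line of the CSV is a header row (an assumption,
    not something stated in the thread). Only k rows are held in memory.
    Returns the number of data rows written.
    """
    rng = random.Random(seed)
    reservoir = []
    with open(src_path, newline="") as src:
        reader = csv.reader(src)
        header = next(reader)
        for i, row in enumerate(reader):
            if i < k:
                # Fill the reservoir with the first k rows.
                reservoir.append(row)
            else:
                # Keep the new row with probability k / (i + 1).
                j = rng.randint(0, i)
                if j < k:
                    reservoir[j] = row
    with open(dst_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        writer.writerow(header)
        writer.writerows(reservoir)
    return len(reservoir)
```

A call such as `sample_csv_rows("big.csv", "sample.csv", 5000)` (both file names hypothetical) would read the large file once and write a much smaller CSV, which might then be easier to load and annotate in RapidMiner. The sample size is a placeholder to be chosen for the task at hand.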

    Best regards,
    Marius
