
How to preprocess US Census Data (1990) Data Set

Sylten Member Posts: 2 Contributor I
edited November 2018 in Help
Hi!

I have an assignment in my Data Mining class: analyze the US Census Data (1990) data set (https://archive.ics.uci.edu/ml/datasets/US+Census+Data+(1990)), which contains 68 attributes and far more examples than my computer can handle. I'm trying to use RapidMiner to analyze this data.

I haven't been able to read the data successfully: RapidMiner shuts down every time because it runs out of memory.

I first tried to read the txt file directly using Read CSV. Since this didn't work, I converted the txt file to a csv file, which didn't work either. Then I tried to filter out all examples except for a few while at the same time removing all attributes except 4. That didn't work either.

Do any of you guys have any idea of what I should do in order to read this data successfully in RapidMiner?

Thank you for your time.

Answers

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,511 RM Data Scientist
    Hello Sylten,

    You want to read in this data file: https://archive.ics.uci.edu/ml/machine-learning-databases/census1990-mld/USCensus1990.data.txt right?

    Best,

    Martin

    Edit: I was able to read the linked data file into my RapidMiner. Are you using the Starter edition of RM 6.2? That edition is limited to 1 GB of RAM, so that might be the reason.

    There might be the option of an academic licence. Just write a mail to me: mschmitz@rapidminer.com
    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
  • Sylten Member Posts: 2 Contributor I
    Yes that's right
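
If more memory is not available, one possible workaround for the limit discussed above is to shrink the file outside RapidMiner first and then load the smaller copy with Read CSV. The following is only a minimal sketch in Python, not something suggested in this thread: it assumes the data file is comma-separated with a header row, and the function name `shrink_csv` and its parameters are made up for illustration. It streams the file one row at a time, so memory use stays small regardless of file size.

```python
import csv


def shrink_csv(src_path, dst_path, keep_columns, step=100):
    """Copy src_path to dst_path, keeping only keep_columns and every step-th row.

    Assumes src_path is comma-separated with a header row (an assumption,
    check the first line of the file before relying on it).
    Returns the number of data rows written.
    """
    kept = 0
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        reader = csv.DictReader(src)

        # Fail early if a requested column is not in the header.
        missing = [c for c in keep_columns if c not in (reader.fieldnames or [])]
        if missing:
            raise ValueError(f"columns not found in header: {missing}")

        # extrasaction="ignore" drops every column not listed in keep_columns.
        writer = csv.DictWriter(dst, fieldnames=keep_columns, extrasaction="ignore")
        writer.writeheader()

        # Rows are read lazily, one at a time, so the whole file is never in memory.
        for i, row in enumerate(reader):
            if i % step == 0:
                writer.writerow(row)
                kept += 1
    return kept
```

A call such as `shrink_csv("USCensus1990.data.txt", "census_small.csv", [...], step=100)`, with the list filled in from the file's actual header row, would keep roughly 1% of the examples. Taking every step-th row is a simple systematic sample; if the file is ordered in some meaningful way, a random sample would be a safer choice.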