Extract a subset of examples by using a list of IDs from a different ExampleSet

schwoe Member Posts: 1 Contributor I
Hi,

I have two ExampleSets with different attributes but the same IDs (stored as string values). I import both from CSV files. I want to select the IDs from one ExampleSet where a certain attribute has a given value, and then use that list of IDs to filter the examples of the other ExampleSet.

In SQL syntax, it would be something like this:

SELECT * FROM table_1 WHERE table_1.id IN (SELECT id FROM table_2 WHERE table_2.attribute = value)

Is it possible to save multiple values to a macro and loop over them to filter the examples of the other ExampleSet? Or what would be the best way to perform this task in RapidMiner?
Alternatively, would it be wiser to code my own operator, or to perform the operation outside of RapidMiner (e.g. by writing the data to an SQL database first)?

By the way, I am using the Community Edition of RapidMiner.

Answers

  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    Have a look at the Join operator. Use join_type "inner" and configure the join_attributes to use the ID attributes of your data sets.
    The Join operator works the same way as the SQL JOIN statement.

    Best regards,
    Marius
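
For readers who want to check the equivalence outside RapidMiner, here is a minimal sketch in Python using the standard-library sqlite3 module. It runs the IN-subquery from the question and an inner join on the ID side by side, so the two results can be compared. The function name filter_by_ids, the column name payload, and the sample rows are hypothetical; only table_1, table_2, id, and attribute come from the original query. The sketch also assumes that IDs are unique in table_2, since duplicate IDs there would duplicate rows in the join result but not in the subquery result.

```python
import sqlite3


def filter_by_ids(rows_1, rows_2, value):
    """Return table_1 rows whose id occurs in table_2 with attribute == value.

    Computes the result twice (IN-subquery and inner join) and returns both.
    """
    conn = sqlite3.connect(":memory:")
    # Hypothetical schemas: 'payload' stands in for table_1's other attributes.
    conn.execute("CREATE TABLE table_1 (id TEXT PRIMARY KEY, payload TEXT)")
    conn.execute("CREATE TABLE table_2 (id TEXT PRIMARY KEY, attribute TEXT)")
    conn.executemany("INSERT INTO table_1 VALUES (?, ?)", rows_1)
    conn.executemany("INSERT INTO table_2 VALUES (?, ?)", rows_2)

    # Form used in the question: filter table_1 by a list of IDs from table_2.
    subquery = conn.execute(
        "SELECT * FROM table_1 WHERE id IN "
        "(SELECT id FROM table_2 WHERE attribute = ?) ORDER BY id",
        (value,),
    ).fetchall()

    # Join form: inner join on the ID, keeping only table_1's columns.
    joined = conn.execute(
        "SELECT t1.* FROM table_1 AS t1 "
        "INNER JOIN table_2 AS t2 ON t1.id = t2.id "
        "WHERE t2.attribute = ? ORDER BY t1.id",
        (value,),
    ).fetchall()

    conn.close()
    return subquery, joined


if __name__ == "__main__":
    rows_1 = [("a", "x1"), ("b", "x2"), ("c", "x3")]
    rows_2 = [("a", "yes"), ("b", "no"), ("c", "yes")]
    print(filter_by_ids(rows_1, rows_2, "yes"))
```

In RapidMiner terms, the WHERE clause on table_2 would presumably correspond to filtering the second ExampleSet before it enters the Join operator, and the join form returns only table_1's columns because of the explicit t1.* selection.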