Map operator function

Nairi Member Posts: 4 Newbie
edited May 4 in Help
Hi. First off, this may or may not relate directly to a problem with RapidMiner. I'm simply stuck on how to apply the Map operator to my data as part of an ETL process. Please note that I'm first and foremost a language student and had to take this course as a requirement, so I have basically zero knowledge of this.

This is the data I'm using, and as you can see, it is mostly numerical. The sample I learned the Map operator from uses it to change the values of a Gender attribute, or at least something other than numbers. My question is: can someone show me how to use the Map operator on attributes whose values span a very wide range and differ in nearly every row (e.g. passenger count, seat count, or flight distance, as in the data)? I'm thinking it may be possible if I select only part of the data and use the Map operator on an attribute like 'Destination_city', which has more consistent values (for example, there are 100 flights that leave from Nevada). But this is probably a whole other thing? I'm confused. If there's anyone who can help me, I would greatly appreciate it.

P.S. I cannot post a link to the data because I'm still a newbie.

Best Answers

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,517 RM Data Scientist
    Solution Accepted
    Hey,
    can you maybe post example data and how you want to transform it?

    Cheers,
    Martin
    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
  • Nairi Member Posts: 4 Newbie
    Solution Accepted
    The dataset has these attributes:
    1. Origin_airport: Three letter airport code of the origin airport
    2. Destination_airport: Three letter airport code of the destination airport
    3. Origin_city: Origin city name
    4. Destination_city: Destination city name
    5. Passengers: Number of passengers transported from origin to destination
    6. Seats: Number of seats available on flights from origin to destination
    7. Flights: Number of flights between origin and destination (multiple records for one month, many with flights > 1)
    8. Distance: Distance (to nearest mile) flown between origin and destination
    9. Fly_date: The date (yyyymm) of flight
    10. Origin_population: Origin city's population as reported by US Census
    11. Destination_population: Destination city's population as reported by US Census
    I'm pretty sure the data has no missing values, because RapidMiner does not report any. But I'm confused about how I can use the Map operator on this dataset. Or should I skip it and go straight to the next ETL step, data type conversion? I don't have any idea how I want to transform the data, very sorry.
  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,517 RM Data Scientist
    Solution Accepted
    Hi,
    what do you want to map? What's the task at hand?

    BR,
    Martin

Answers

  • Nairi Member Posts: 4 Newbie
    The task is to run the dataset through an ETL process. But at this point, I think there is no attribute that needs to be mapped or changed. I was thinking of either skipping this operator, or still placing it on the process canvas without setting anything in its parameters, and then running the whole process as usual.
  • Nairi Member Posts: 4 Newbie

    Hello, I have a new question. I apologize and thanks in advance.

    This is about using the 'Generate Attributes' operator. How do I write the expression if I want rows with 'Flights' above 200 to be labeled Active, and rows with fewer than 200 flights to be labeled Nonactive?
    It's something like this:
    Active flights: >200
    Nonactive flights: <200

    Thank you so much.
  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,517 RM Data Scientist
    Hi @Nairi ,
    the Operator Toolbox extension has an operator called "Replace Rare Values". That's the easiest way.

    Best,
    Martin
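
For the Active/Nonactive question above, the threshold rule itself can be sketched outside RapidMiner in a few lines of Python. This is only an illustration of the logic, not a RapidMiner process: the function name `label_flights`, the new `Activity` attribute, and the sample rows are all made up for the example. The question does not say what should happen at exactly 200 flights, so the sketch assumes 200 counts as Nonactive.

```python
def label_flights(flights, threshold=200):
    """Return "Active" when flights exceeds the threshold, otherwise "Nonactive".

    Assumption: a value exactly equal to the threshold is treated as
    "Nonactive", since the original rule only covers >200 and <200.
    """
    return "Active" if flights > threshold else "Nonactive"


# Hypothetical sample rows; the values are invented, not taken from the dataset.
rows = [
    {"Origin_airport": "LAS", "Destination_airport": "LAX", "Flights": 250},
    {"Origin_airport": "LAS", "Destination_airport": "SEA", "Flights": 200},
    {"Origin_airport": "LAS", "Destination_airport": "DEN", "Flights": 12},
]

# Add a new "Activity" attribute to each row without modifying the originals.
labeled = [dict(row, Activity=label_flights(row["Flights"])) for row in rows]

for row in labeled:
    print(row["Destination_airport"], row["Flights"], row["Activity"])
```

Inside RapidMiner, the same rule would presumably be written in Generate Attributes as a single conditional expression on `Flights`, along the lines of `if(Flights > 200, "Active", "Nonactive")`; treat that exact syntax as an assumption and check it against the operator's expression editor.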