Read SPSS ==> Attribute Selector returns no attributes at all

tommedema Member Posts: 4 Contributor I
edited November 2018 in Help
Hello everyone,

I'm new here so please be gentle on me. ;)

Anyway, I have an SPSS file ready to be imported into RapidMiner. I've created an SPSS Reader from import --> data --> SPSS.

The reader works just fine and when I set it to output immediately I get a nice table of my data.

However, I need to filter this data (I need to remove some attributes). So I added an "Attribute Selector" and connected the two. This worked just fine when importing normal CSV files.

For some reason the Attribute Selector node never receives any output from the SPSS node. The same happens when I put a node like the Sample filter node in there: it never receives output from SPSS Reader. Yet, the SPSS reader does return my table correctly when I assign it directly to the final result port.

What might be causing this? Am I doing something wrong?

Answers

  • Marco_Boeck Administrator, Moderator, Employee, Member, University Professor Posts: 1,993 RM Engineering
    Hi,

    the output is received normally. What you are missing is the metadata for the Read SPSS operator, which is used to show the expected outcome after each operator to ease process design. Missing metadata there will not break your process: you can still put, for example, the Select Attributes operator after it and it will work anyway.
    However, I don't see how you can see metadata for the Read CSV operator. It will also not show any metadata, because it would actually have to read the file for that, which is often not desired (and can severely impact performance).

    Regards,
    Marco
  • tommedema Member Posts: 4 Contributor I
    Sorry, but how does that help me?

    As mentioned, the attribute selector returns no elements. So how do I select the elements that it should not filter out?
  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
    Hi,

    umm, by simply typing in the names of the desired attributes? This is possible either in the subset selection GUI, by typing and adding each desired attribute, or with a regular expression that puts "|" between the attribute names.

    If you want the most comfortable way, just use the preferred way, which is also suggested in the manual: load the data and directly store it in your repository with the operator "Store". Then use the repository entry; it will deliver all metadata, and you can use it everywhere during process design.

    Cheers,
    Ingo
  • tommedema Member Posts: 4 Contributor I
    The problem is that when I click 'subset' I cannot type in anything. I must use the GUI window and it shows no attributes in the list.

    So how do I, for example, set it to only allow the variables test1, test2 and test3?

    Also, the Store method does not allow me to import SPSS files.
  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
    Hi,

    tommedema wrote: "The problem is that when I click 'subset' I cannot type in anything. I must use the GUI window and it shows no attributes in the list."

    Exactly. Just type the attribute names into this GUI window. Open it by clicking "Select Attributes..." in the operator's parameters. In the window, look at the right side ("Selected Attributes"), type the name of your first attribute into the text field at the top ("[Filter]") and press the plus icon beside the text field. (Don't worry: while typing in the second attribute, the list below is likely to become smaller or even empty, since the same text field is also used as a search filter, as its name indicates.) Tada... the attribute will now be in the list of selected attributes at the bottom.

    The other option would be to use the parameter setting "regular_expression" and use the expression "test1|test2|test3". Easy for only a few attributes but less comfortable for hundreds of them  ;)


    As I said before, the best way is to use the repository to get all the good things about the meta data propagation (read the manual about how and why). So we should resolve the following problem:

    tommedema wrote: "Also, the Store method does not allow me to import SPSS files."

    Of course not. It should not import, but store. Import it with the, well, import operator "Read SPSS". So you will end up with a small import process consisting of only two operators: "Read SPSS" and "Store". In a second (third...) process, you can start the actual data transformation and analysis from the example set stored in the repository, and the SPSS file will no longer be used. Then you will always get the (transformed) metadata during process design, at least as far as possible.

    Cheers,
    Ingo
  • tommedema Member Posts: 4 Contributor I
    Excellent. The first method works fine, although trying to use the Store node was not very successful. While it seemed to work it did not seem to add metadata and thus the attributes list was still empty.

    Thanks a lot for the help.
  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
    Hi again,

    glad to hear that.

    About the store: Did you really use two separate processes? (First one: just read the data and store it; second one: use the stored data set as the starting point for the selection.) From your text I would assume that you only used "Store", which alone is of course not going to help at all. Please refer to the manual for more information about how to use the repository and why. It is really worth the effort, since it will dramatically improve your RapidMiner experience  ;)

    Cheers,
    Ingo
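
The regular-expression option mentioned in the thread ("test1|test2|test3") can be tried out on a plain list of names before typing it into the operator. The following is only a rough Python sketch, not RapidMiner code: it assumes the expression has to match the whole attribute name, and the column names other than test1, test2 and test3, as well as both helper functions, are made up for illustration.

```python
import re


def select_attributes(attribute_names, pattern):
    """Return the names that the pattern matches in full, in their original order."""
    compiled = re.compile(pattern)
    # fullmatch: the whole name must match (an assumption about how the
    # operator applies the expression, not something confirmed in the thread).
    return [name for name in attribute_names if compiled.fullmatch(name)]


def pattern_for(names):
    """Build a 'name1|name2|...' expression from literal attribute names."""
    # re.escape keeps characters such as '.' from acting as regex operators.
    return "|".join(re.escape(name) for name in names)


# Hypothetical column list standing in for the attributes of the SPSS file.
columns = ["id", "test1", "test2", "test3", "test10", "comment"]

selected = select_attributes(columns, "test1|test2|test3")
print(selected)
```

For hundreds of attributes, pattern_for(list_of_names) produces the same kind of expression without typing every "|" by hand; the result can then be pasted into the regular_expression parameter.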