
unclear matrix for MetaCost operator

dan_agape Member Posts: 106 Maven
edited November 2018 in Help

Any answer/comment to the question below would be appreciated. Many thanks,
Dan

When defining the cost matrix of the MetaCost operator, the names class 1 and class 2 appear (suppose you have only 2 classes). How do you know which is which (for instance, that class 1 is "Yes" and class 2 is "No")? Perhaps there is an obvious, user-friendly solution, but I do not see it.

Answers

  • haddock Member Posts: 849 Maven
    Hi Dan,

    Sadly there isn't really a better answer than the source code, which I think passes the matrix from the parameters to the model without changing the order.

    In the operator we have this before the learning (MetaCost.java):

        // get cost matrix
        double[][] costMatrix = getParameterAsMatrix(PARAMETER_COST_MATRIX);

    and this to produce the actual model:

        return new MetaCostModel(inputSet, models, costMatrix);

    and this to show that the same thing gets stored in the new model:

        public MetaCostModel(ExampleSet exampleSet, Model[] models, double[][] costMatrix) {
            super(exampleSet);
            this.models = models;
            this.costMatrix = costMatrix;
        }

    Actually, the same answer probably applies to your other questions this evening: you need to check the code out for yourself. Believe me, if I can manage it, anyone can!

    So I leave you with the pleasures of Dark Vega  8)
  • dan_agape Member Posts: 106 Maven
    Hi Haddock,

    Thanks - that's been useful.

    RM is an excellent and impressive DM suite in many respects. However, user-friendliness is essential for software to become significant in this competitive market. I wonder, however, whether in the commercial versions the meaning of the columns/rows in the confusion matrix is obvious (otherwise one can enter a higher cost without knowing which class it applies to).

    By the way, I have tried to use the Weka MetaCost operator instead, just to stick to modeling via the GUI, but there the inner operator that builds the model cannot be linked to the outer operator to get the dataset and return the model.

    Best,
    Dan



  • dan_agape Member Posts: 106 Maven
    Correction: "confusion matrix" in my post above should read "cost matrix".
  • haddock Member Posts: 849 Maven
    Hi there,

    The RM Freebie does not have different functionality from the commercial version as far as I know. I'm with you on the need for handy help; equally we could make it ourselves... after all this is open source software  8)
  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi all,
    at first sight your problem seems easy to solve: read the data, fetch the possible class labels and show them to the user. But it's not that easy. Before the data reaches the operator whose parameters you are going to set, it passes through many other operators, so RM would have to execute all of them to be sure which labels are actually present. This might take an arbitrary amount of time (as usual for data mining processes), which makes it much more complicated. We have taken first steps with the so-called MetaData transformation, where only data about the data is handled, and which already solves many problems such as attribute selection. A priceless feature if you ever tried software without it...
    But you cannot rely on this for such an important feature, because many transformations cannot be simulated without taking the real data into account.

    As a way out, you can explicitly remap your label attribute, so that you know the order of the classes.

    Anyway, we are working hard to further improve the user-friendliness and ease of use of our software. You might add a feature request to our bug tracker, so that we can't forget this. And if you become an enterprise customer, I promise you, we will immediately attach any information in the meta data to this matrix. (Just to show you why it might be worth becoming an enterprise customer. It makes us jump if you call...)

    Greetings,
      Sebastian
  • dan_agape Member Posts: 106 Maven
    Hi Sebastian,

    Thanks very much for your answer. I am quite new to RM (I am exploring some mature DM suites to consider for my business in the future). Can you please tell me how to explicitly remap the label attribute such that the order of its values is known? Thanks.

    Best
    Dan

  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    if you have binominal labels, you can simply use the Remap Binominals operator to define which class is negative (= first) and which is positive (= second).

    Anyway, you can take a look at the meta data view of your example set. In the Range column, a list of all possible nominal values is given. This list is in the order of the internal mapping.

    Greetings,
      Sebastian
  • dan_agape Member Posts: 106 Maven
    Sebastian, I have tested your first suggestion, and it worked.

    However, I am still not quite convinced about your second suggestion. What I observed with several datasets is that if you evaluate a model built with the MetaCost operator, both the confusion matrix and the cost matrix always use the same order of classes for their columns. However, this order is not always the same as the order of the values listed for the label attribute in the meta data, as you suggest. I used no remapping in this case.

    Best,
    Dan
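
For readers landing on this thread, here is a minimal, self-contained Java sketch of why the class order matters for the cost matrix. It is not RapidMiner code. `firstSeenOrder` is a hypothetical stand-in that assumes nominal values receive their internal index in order of first appearance in the data, `remapBinominal` mimics the explicit negative/positive remapping suggested above, and `cheapestClass` assumes the matrix is oriented as `costMatrix[predicted][actual]`, which the thread does not confirm. All names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LabelOrderSketch {

    // Hypothetical stand-in for an internal nominal mapping:
    // each distinct value gets the next free index when it is first seen.
    static List<String> firstSeenOrder(List<String> labelColumn) {
        List<String> order = new ArrayList<>();
        for (String value : labelColumn) {
            if (!order.contains(value)) {
                order.add(value);
            }
        }
        return order;
    }

    // Explicit remapping for a two-class label: negative = index 0, positive = index 1.
    static List<String> remapBinominal(String negative, String positive) {
        return Arrays.asList(negative, positive);
    }

    // Returns the class index with the lowest expected cost.
    // Assumption: costMatrix[predicted][actual]; confidences[actual] uses the same class order.
    static int cheapestClass(double[] confidences, double[][] costMatrix) {
        int best = 0;
        double bestRisk = Double.POSITIVE_INFINITY;
        for (int predicted = 0; predicted < costMatrix.length; predicted++) {
            double risk = 0.0;
            for (int actual = 0; actual < confidences.length; actual++) {
                risk += confidences[actual] * costMatrix[predicted][actual];
            }
            if (risk < bestRisk) {
                bestRisk = risk;
                best = predicted;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Without remapping, the order depends on which value shows up first in the data.
        System.out.println(firstSeenOrder(Arrays.asList("Yes", "No", "Yes")));
        System.out.println(firstSeenOrder(Arrays.asList("No", "Yes", "No")));

        // With an explicit remapping the order is known: index 0 = "No", index 1 = "Yes".
        List<String> classes = remapBinominal("No", "Yes");

        // Predicting "No" when the truth is "Yes" costs 5; the opposite mistake costs 1.
        double[][] costMatrix = { {0, 5}, {1, 0} };
        double[] confidences = {0.7, 0.3}; // P(No) = 0.7, P(Yes) = 0.3

        System.out.println(classes.get(cheapestClass(confidences, costMatrix)));
    }
}
```

With these numbers the sketch picks "Yes" even though "No" has the higher confidence, because the expected cost of predicting "No" (0.3 × 5 = 1.5) exceeds that of predicting "Yes" (0.7 × 1 = 0.7). If the two classes were silently in the opposite order, the same matrix would penalize the wrong mistake, which is why fixing the order explicitly before filling in the matrix is the safer route.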