
Matrix is singular

tbhuy Member Posts: 7 Contributor II
Hi everybody,
I'm using RapidMiner to run an experiment with RDA. For my datasets, the application reports: "Matrix is singular."
As I understand it, RDA is a method that should eliminate the singularity of the matrix. Is this behavior abnormal? Can you help me, please?
Thanks,
  Huy

Answers

  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi Huy,
    which value did you set for the alpha parameter?

    Greetings,
      Sebastian
  • tbhuy Member Posts: 7 Contributor II
    I have tried almost every value of alpha, but the same problem occurs.
    I have tried different datasets too.
    Thanks for your advice.
  • tbhuy Member Posts: 7 Contributor II
    Here is a sample process:
    <operator name="Root" class="Process" expanded="yes">
        <operator name="ArffExampleSource" class="ArffExampleSource">
            <parameter key="data_file" value="C:\Documents and Settings\Administrator\Desktop\lung.arff"/>
            <parameter key="label_attribute" value="class"/>
        </operator>
        <operator name="RegularizedDiscriminantAnalysis" class="RegularizedDiscriminantAnalysis" breakpoints="after">
            <parameter key="alpha" value="0.85"/>
        </operator>
    </operator>


    Here is the error:
      Root[1] (Process)
      +- ArffExampleSource[1] (ArffExampleSource)
      +- RegularizedDiscriminantAnalysis[1] (RegularizedDiscriminantAnalysis)
    G Aug 16, 2009 10:19:32 AM: [Fatal] RuntimeException occured in 1st application of RegularizedDiscriminantAnalysis (RegularizedDiscriminantAnalysis)
    G Aug 16, 2009 10:19:32 AM: [Fatal] Process failed: operator cannot be executed (Matrix is singular.). Check the log messages...
              Root[1] (Process)
              +- ArffExampleSource[1] (ArffExampleSource)
    here ==> +- RegularizedDiscriminantAnalysis[1] (RegularizedDiscriminantAnalysis)

    I have just created 2 processes.
    Thanks a lot!
  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi Huy,
    unfortunately I cannot reproduce your error. Does it occur on one of the example set generators? If yes, then please post the process with the generator. This would be a great help for finding the error...

    Greetings,
      Sebastian
  • tbhuy Member Posts: 7 Contributor II
    Hi Sebastian,

    I use datasets from the UCI repository. They are all real datasets, and I have tried several of them.
    The error occurs every time. However, I now realize that the musk dataset does not cause any problem.
    (The error I posted occurred with the lung cancer dataset.)
    The other problem is that the accuracy of RDA is much lower than that of LDA and QDA. Is that normal?
    Thanks for your help,
    Huy
  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    the lung cancer data set contains missing values. Neither RDA nor LDA nor QDA can cope with that. Did you replace them beforehand using the Replace Missing Values operator?

    Greetings,
      Sebastian
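
To illustrate what replacing missing values beforehand means, here is a minimal Python sketch of one common strategy, mean imputation. It is not RapidMiner's operator; the function name `replace_missing_with_mean` and the toy data are hypothetical, and missing entries are assumed to be encoded as NaN.

```python
import math

def replace_missing_with_mean(rows):
    """Replace NaN entries in each column with the mean of that column's observed values."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if not math.isnan(r[j])]
        # Fall back to 0.0 if a column has no observed values at all (an arbitrary choice).
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[j] if math.isnan(v) else v for j, v in enumerate(r)] for r in rows]

# Hypothetical toy data: two attributes, one missing value in each column.
data = [[1.0, float("nan")], [3.0, 4.0], [float("nan"), 8.0]]
cleaned = replace_missing_with_mean(data)
print(cleaned)
```

Deleting the affected rows or columns, as done later in this thread, is the other usual option; it avoids inventing values but leaves fewer examples for estimating the covariance matrix.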
  • tbhuy Member Posts: 7 Contributor II
    Hi,
      I have already deleted the missing data. The problem is that the matrix is singular.
    The main issue is that the class precision stays unchanged when I modify the value of alpha.
      Is there anything wrong with my process?

    <operator name="Root" class="Process" expanded="yes">
        <operator name="ArffExampleSource" class="ArffExampleSource">
            <parameter key="data_file" value="C:\Documents and Settings\Administrator\Desktop\ds\musk.arff"/>
            <parameter key="label_attribute" value="class"/>
        </operator>
        <operator name="XValid" class="XValidation" expanded="yes">
            <operator name="RegularizedDiscriminantAnalysis" class="RegularizedDiscriminantAnalysis">
                <parameter key="keep_example_set" value="true"/>
                <parameter key="alpha" value="0.75"/>
            </operator>
            <operator name="ApplierChain" class="OperatorChain" expanded="yes">
                <operator name="Applier" class="ModelApplier">
                    <parameter key="keep_model" value="true"/>
                    <list key="application_parameters">
                    </list>
                </operator>
                <operator name="Evaluator" class="Performance">
                    <parameter key="use_example_weights" value="false"/>
                </operator>
            </operator>
        </operator>
    </operator>

    Thanks a lot,
    Waiting for your help!
  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    your process setup is fine, provided the missing values are replaced beforehand in another process. Then it should work exactly as you intend. Could you please provide a link to a data set on which the error occurs, and post here every preprocessing step applied to this data beforehand?
    As I said, I cannot reproduce the error, which makes tracing the bug impossible...

    Greetings,
      Sebastian
  • tbhuy Member Posts: 7 Contributor II
    Hi,
    here is the musk dataset that I tested with RDA:
    http://www.mediafire.com/download.php?aigkyqmyjdd
    I have changed alpha to almost every value, but the classification precision is unchanged.
    Here is the lung cancer dataset from which I have deleted the missing values:
    http://www.mediafire.com/download.php?tjjmqwwttwe
    This dataset causes the singular matrix problem.

    Thanks,
    Huy
  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    yes, I see the problem. It would probably be better to add the identity matrix (the unit covariance matrix) to the estimated covariance matrix BEFORE inverting. I will see if I can change that.

    Greetings,
      Sebastian
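
A short sketch of why adding the identity matrix before inverting helps. It assumes a generic shrinkage form of the regularization; the exact way RapidMiner combines alpha with the covariance estimate is not shown in this thread, so the formula is an assumption. Here $\hat{\Sigma}$ is the estimated covariance matrix and $\lambda_i$ are its eigenvalues, which are non-negative and include zeros when the matrix is singular.

```latex
\hat{\Sigma}(\alpha) = (1-\alpha)\,\hat{\Sigma} + \alpha I, \qquad 0 < \alpha \le 1

\lambda_i\bigl(\hat{\Sigma}(\alpha)\bigr) = (1-\alpha)\,\lambda_i(\hat{\Sigma}) + \alpha \;\ge\; \alpha \;>\; 0
```

Every eigenvalue of the regularized matrix is strictly positive, so it can be inverted. Inverting $\hat{\Sigma}$ first fails as soon as any $\lambda_i = 0$, no matter what is added afterwards.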
  • tbhuy Member Posts: 7 Contributor II
    Hi,
      Can I fix this problem myself? I have to present the RDA algorithm in the next few days.
      If you have already solved it, could you please post your corrected code?
    Thanks,
    Huy
  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi Huy,
    unfortunately I haven't found the time yet. Sorry about that; it will take some time, since we are currently under a heavy workload.
    If you have the program code, you might be able to solve the problem yourself. I think the problem is that the inverse of the covariance matrix is computed first and the identity matrix is added afterwards. If the identity matrix were added first, the matrix should no longer be singular during inversion.

    Greetings,
      Sebastian
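
The ordering issue described above can be sketched in plain Python. This is not RapidMiner's source code: the helper names (`covariance`, `invert`, `shrink_towards_identity`), the toy data, and the shrinkage form `(1 - alpha) * S + alpha * I` are all assumptions chosen for illustration. The toy data has fewer examples than attributes, so its covariance estimate is singular.

```python
def covariance(rows):
    """Sample covariance matrix of a list of equally long rows."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
             for j in range(p)] for i in range(p)]

def invert(m, tol=1e-12):
    """Gauss-Jordan inversion with partial pivoting; raises if the matrix is singular."""
    n = len(m)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < tol:
            raise ValueError("Matrix is singular.")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [v / pv for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def shrink_towards_identity(cov, alpha):
    """Assumed regularization: (1 - alpha) * cov + alpha * I, applied BEFORE inversion."""
    p = len(cov)
    return [[(1 - alpha) * cov[i][j] + (alpha if i == j else 0.0)
             for j in range(p)] for i in range(p)]

# Hypothetical toy data: 2 examples, 3 attributes, so the covariance has rank 1.
samples = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]]
cov = covariance(samples)

try:
    invert(cov)  # inverting first fails, which is the order suspected in this thread
except ValueError as err:
    print("invert first:", err)

regularized = shrink_towards_identity(cov, 0.5)
inverse = invert(regularized)  # regularizing first makes the inversion succeed
print("regularize first: inversion succeeded")
```

In this sketch any alpha strictly greater than 0 makes the inversion succeed, because the identity term lifts every zero eigenvalue of the covariance estimate.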