CachedDatabaseExampleSource + Problems

I want to process a lot of data with many simple calculations.
For this, RapidMiner suggests using the CachedDatabaseExampleSource.
I have created this process. It works at first, but after some minutes the process is interrupted.
I got this message: Feb 15, 2010 11:48:43 AM WARNING: Caught exception in concurrent execution of FS (Optimize Selection): java.lang.OutOfMemoryError: Java heap space

What can I do?

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.0">
  <operator activated="true" class="process" expanded="true" name="Root">
    <process expanded="true" height="280" width="195">
      <operator activated="true" class="stream_database" expanded="true" height="60" name="CachedDatabaseExampleSource" width="90" x="45" y="120">
        <parameter key="define_connection" value="url"/>
        <parameter key="database_url" value="jdbc:mysql://localhost:3306/lai_gesamtzeitraum_aisa_baender"/>
        <parameter key="username" value="root"/>
        <parameter key="password" value="w3mYv/Z+Ew7XwAzejr7xJA=="/>
        <parameter key="table_name" value="lai_gesamtzeitraum_aisa_baender"/>
        <parameter key="recreate_index" value="true"/>
        <parameter key="label_attribute" value="lai"/>
        <parameter key="id_attribute" value="id"/>
      </operator>
      <operator activated="true" class="loop_batches" expanded="true" height="60" name="BatchProcessing" width="90" x="45" y="210">
        <parameter key="parallelize_batch_process" value="true"/>
        <process expanded="true" height="280" width="145">
          <operator activated="true" class="generate_function_set" expanded="true" height="76" name="CompleteFeatureGeneration" width="90" x="45" y="30">
            <parameter key="use_plus" value="true"/>
          </operator>
          <operator activated="true" class="rename_by_constructions" expanded="true" height="76" name="Construction2Names" width="90" x="45" y="120"/>
          <operator activated="true" class="optimize_selection" expanded="true" height="94" name="FS" width="90" x="45" y="210">
            <parameter key="limit_number_of_generations" value="true"/>
            <parameter key="keep_best" value="3"/>
            <parameter key="maximum_number_of_generations" value="1"/>
            <parameter key="local_random_seed" value="-1"/>
            <process expanded="true" height="100" width="145">
              <operator activated="true" class="bootstrapping_validation" expanded="true" height="112" name="BootstrappingValidation (2)" width="90" x="45" y="30">
                <parameter key="local_random_seed" value="-1"/>
                <process expanded="true" height="100" width="30">
                  <operator activated="true" class="linear_regression" expanded="true" height="76" name="LinearRegression (3)" width="90" x="-70" y="30"/>
                  <connect from_port="training" to_op="LinearRegression (3)" to_port="training set"/>
                  <connect from_op="LinearRegression (3)" from_port="model" to_port="model"/>
                  <portSpacing port="source_training" spacing="0"/>
                  <portSpacing port="sink_model" spacing="0"/>
                  <portSpacing port="sink_through 1" spacing="0"/>
                </process>
                <process expanded="true" height="100" width="195">
                  <operator activated="true" class="apply_model" expanded="true" height="76" name="ModelApplier (3)" width="90" x="45" y="30">
                    <list key="application_parameters"/>
                  </operator>
                  <operator activated="true" class="performance_regression" expanded="true" height="76" name="RegressionPerformance" width="90" x="95" y="30">
                    <parameter key="main_criterion" value="squared_correlation"/>
                    <parameter key="root_mean_squared_error" value="true"/>
                    <parameter key="squared_correlation" value="true"/>
                  </operator>
                  <connect from_port="model" to_op="ModelApplier (3)" to_port="model"/>
                  <connect from_port="test set" to_op="ModelApplier (3)" to_port="unlabelled data"/>
                  <connect from_op="ModelApplier (3)" from_port="labelled data" to_op="RegressionPerformance" to_port="labelled data"/>
                  <connect from_op="RegressionPerformance" from_port="performance" to_port="averagable 1"/>
                  <portSpacing port="source_model" spacing="0"/>
                  <portSpacing port="source_test set" spacing="0"/>
                  <portSpacing port="source_through 1" spacing="0"/>
                  <portSpacing port="sink_averagable 1" spacing="0"/>
                  <portSpacing port="sink_averagable 2" spacing="0"/>
                </process>
              </operator>
              <connect from_port="example set" to_op="BootstrappingValidation (2)" to_port="training"/>
              <connect from_op="BootstrappingValidation (2)" from_port="averagable 1" to_port="performance"/>
              <portSpacing port="source_example set" spacing="0"/>
              <portSpacing port="source_through 1" spacing="0"/>
              <portSpacing port="sink_performance" spacing="0"/>
            </process>
          </operator>
          <connect from_port="exampleSet" to_op="CompleteFeatureGeneration" to_port="example set input"/>
          <connect from_op="CompleteFeatureGeneration" from_port="example set output" to_op="Construction2Names" to_port="example set input"/>
          <connect from_op="Construction2Names" from_port="example set output" to_op="FS" to_port="example set in"/>
          <portSpacing port="source_exampleSet" spacing="0"/>
        </process>
      </operator>
      <operator activated="true" class="write_database" expanded="true" height="60" name="DatabaseExampleSetWriter" width="90" x="95" y="120">
        <parameter key="define_connection" value="url"/>
        <parameter key="database_url" value="jdbc:mysql://localhost:3306/lai_gesamtzeitraum_aisa_baender"/>
        <parameter key="username" value="root"/>
        <parameter key="password" value="w3mYv/Z+Ew7XwAzejr7xJA=="/>
        <parameter key="table_name" value="output"/>
      </operator>
      <connect from_op="CachedDatabaseExampleSource" from_port="output" to_op="BatchProcessing" to_port="example set"/>
      <connect from_op="BatchProcessing" from_port="example set" to_op="DatabaseExampleSetWriter" to_port="input"/>
      <connect from_op="DatabaseExampleSetWriter" from_port="through" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>


    Hi Angela,
    this is not a problem of the cached database source, but of the feature selection operator. I would advise switching to RapidMiner 5.0, since it includes new feature selection operators that limit the memory consumption to a very small overhead. In 4.x these operators were only available to paying customers.

    Hello Sebastian,

    many thanks for your answer. I have also changed the RapidMiner version from 4.6 to 5 so that I can use the selection function in combination with the CachedDatabaseExampleSource. I have already increased the parameters of the MySQL database.
    At first it seemed to work, but then the process was aborted.
    I think I have about 300 different variables, and I want to create different functions from them. That means it will create 300 x 300 new variables, even if I only use one new function (sum or difference between the variables).
    So I think RapidMiner has a limit in handling so many variables.
    If you have another idea, I would be interested in it.

    Best regards

    Hi Angela,
    RapidMiner copes with 90,000 attributes very well. In fact, we have several problems where that many occur. But databases are limited to around one thousand columns, so you cannot store this data set in a table! If you are using a CachedDatabaseExampleSource (or the Stream Database operator of RapidMiner 5), it will always store the data in the database and hence crash.
    Have you already tried the YAGGA operators? They construct new attributes in a more directed fashion using genetic approaches.
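
To make the numbers in this thread concrete, here is a rough Python sketch of the arithmetic. It only counts attributes and does not call RapidMiner. The function name `count_pairwise_attributes` is made up for illustration, and the assumption that a commutative function such as the sum is generated once per unordered pair (without pairing an attribute with itself) is mine; the real operator may count differently.

```python
def count_pairwise_attributes(n_attributes):
    """Number of unordered pairs of distinct attributes.

    Assumption: one new attribute per pair for a commutative
    function such as a + b, and no attribute paired with itself.
    """
    return n_attributes * (n_attributes - 1) // 2


N_ORIGINAL = 300      # "about 300 different variables" from the thread
COLUMN_LIMIT = 1000   # "around one thousand columns" from the thread

# Upper bound quoted in the thread: every attribute combined with every attribute.
upper_bound = N_ORIGINAL * N_ORIGINAL

# Tighter count under the unordered-pair assumption above.
pairs_only = count_pairwise_attributes(N_ORIGINAL)

print("upper bound (n x n):", upper_bound)
print("unordered pairs:", pairs_only)
print("fits into one table:", N_ORIGINAL + pairs_only <= COLUMN_LIMIT)
```

Even under the more generous unordered-pair assumption, a single generated function produces far more columns than a table of roughly one thousand columns can hold, which matches the explanation above.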

