
# Create Objective Function for Evolutionary Algorithm (Parameter Optimization)

Member Posts: 9 Contributor II
edited November 2018 in Help
Dear Community,

I have been working on price forecasting with support vector regression for my thesis. I derived relevant features from the prices and built a process that performs feature selection and parameter optimization together. An evolutionary algorithm selects the best "k" attributes (the attributes are weighted by the SVM attribute weight algorithm and the top "k" are kept) together with the support vector regression parameters (nu, gamma, C). I also combine the performance vectors with the Combine Performances operator, using root mean squared error and number of attributes as criteria. The weight of root mean squared error is 0.7 and the weight of number of attributes is 0.3. The process works great: it finds the best parameters and the minimum number of features.

My problem is adding correlation as a criterion in Combine Performances. Root mean squared error and number of attributes are both minimized, but correlation must be maximized. How can I combine these three criteria in the Combine Performances operator? I looked at the create formula operation. Maybe it could help, but I don't know how it would work. I look forward to your help.

Another question: the evolutionary optimization stops before reaching the maximum number of generations. Max generations is set to 100, but it generally stops around the 50th generation. Do you know why this happens?
`<?xml version="1.0" encoding="UTF-8" standalone="no"?><process version="5.2.008">  <context>    <input/>    <output/>    <macros/>  </context>  <operator activated="true" class="process" compatibility="5.2.008" expanded="true" name="Process">    <description>Pazar:Weight By PCA (5), Gauss,Tour (0.25), Prob(0.9),Keep BestParameter set:Performance: PerformanceVector [*****root_mean_squared_error: 15.123132 +/- 1.558278 (mikro: 15.203202 +/- 0.000000)-----absolute_error: 10.749156 +/- 1.021879 (mikro: 10.749156 +/- 10.751419)-----relative_error: 8.83% +/- 0.96% (mikro: 8.83% +/- 11.87%)-----correlation: 0.893216 +/- 0.021480 (mikro: 0.892177)-----spearman_rho: 0.897900 +/- 0.023013 (mikro: 5.387402)-----kendall_tau: 0.730874 +/- 0.031571 (mikro: 4.385243)]SVM.C	= 349.34275575370486SVM.nu	= 0.4741072199572831SVM.gamma	= 1.0E-5Select by Weights.k	= 33C:349.33742656664316	Nu:0.4291679723296585	Gamma:1.0E-5	 RMSE:13.069646621784996	MAPE:0.07480695227635964	MAE:9.443566849050331	Corr:0.9176969415924828	K:26.0Pazar:Weight By Relief (20) (NonNormalized, Gauss,Tour (0.25), Prob(0.9)C:324.82805462669523	NU:0.46831863984298205	Gamma:7.099485940907746E-5	RMSE:11.20861632389257	MAPE0.06294480347697497	MAE:7.949018244089598	Corr:0.9374834724799084	K:33.0CumartesiWeight By Relief (20) (NonNormalized, Gauss,Boltzmann, Prob(0.9)PerformanceVector [*****root_mean_squared_error: 12.903715 +/- 1.111219 (mikro: 12.951474 +/- 0.000000)-----absolute_error: 9.139935 +/- 0.653655 (mikro: 9.139935 +/- 9.176180)-----relative_error: 7.02% +/- 0.54% (mikro: 7.02% +/- 8.97%)-----correlation: 0.929155 +/- 0.012758 (mikro: 0.929126)-----spearman_rho: 0.924755 +/- 0.011501 (mikro: 5.548529)-----kendall_tau: 0.772453 +/- 0.017273 (mikro: 4.634719)]SVM.C	= 324.82805462669523SVM.nu	= 0.46831863984298205SVM.gamma	= 7.099485940907746E-5Select by Weights.k	= 33PazartesiWeight By Relief (20) (Non-Normalized, Switch,Boltzmann, Prob(0.9)PerformanceVector [*****root_mean_squared_error: 14.406567 +/- 1.108196 
(mikro: 14.450276 +/- 0.000000)-----absolute_error: 10.157388 +/- 0.753901 (mikro: 10.158396 +/- 10.277037)-----relative_error: 7.63% +/- 0.95% (mikro: 7.63% +/- 9.78%)-----correlation: 0.886775 +/- 0.013296 (mikro: 0.888370)-----spearman_rho: 0.899418 +/- 0.012317 (mikro: 5.396508)-----kendall_tau: 0.736149 +/- 0.015988 (mikro: 4.416893)]SVM.C	= 324.82805462669523SVM.nu	= 0.4746549118743857SVM.gamma	= 7.099485940907746E-5Select by Weights.k	= 33C:324.82805462669523	Nu:0.4746549118743857	Gamma:7.099485940907746E-5	RMSE:14.273522402331189	MAPE:0.07110594344955053	MAE:9.718228530010594	Corr:0.8747126747273638	K:33.0C:324.82805462669523	Nu:0.4746549118743857	Gamma7.099485940907746E-5	RMSE:14.13840447755943	MAPE:0.07282235225443726	MAE:9.781834957950943	Corr:0.8722168856716528	K:25.0Hafta İçiWeight By Relief (20) (Non-Normalized, Switch,Boltzmann, Prob(0.9)Performance: PerformanceVector [*****root_mean_squared_error: 11.759630 +/- 1.530881 (mikro: 11.858857 +/- 0.000000)-----absolute_error: 7.690999 +/- 0.924004 (mikro: 7.690999 +/- 9.026684)-----relative_error: 5.49% +/- 0.70% (mikro: 5.49% +/- 7.37%)-----correlation: 0.931027 +/- 0.013617 (mikro: 0.930025)-----spearman_rho: 0.935731 +/- 0.008928 (mikro: 5.614387)-----kendall_tau: 0.794053 +/- 0.019052 (mikro: 4.764317)]SVM.C	= 324.82805462669523SVM.nu	= 0.4746549118743857SVM.gamma	= 1.9062962726105077E-4Select by Weights.k	= 36C:324.82805462669523	NU:0.4746549118743857	Gamma:1.9062962726105077E-4	RMSE:11.007134982553742	MAPE:0.05390369973896053	MAE:7.495131592979047	Corr:0.9348047744384642	K:36.0C:324.82805462669523	NU:0.4746549118743857	Gamma:7.099485940907746E-5	RMSE:11.4593106153097	MAPE:0.05537831381509461	MAE:7.888946409682697	Corr:0.9337350092766873	K:33.0C:188.2015332981262	NU:0.29386413755875573	Gamma:1.9062962726105077E-4	RMSE:11.526658754110203	MAPE:0.05801719898871299	MAE:7.9792785705712825	Corr:0.9325221843509023	K:36.0</description>    <parameter key="parallelize_main_process" value="true"/>    <process 
expanded="true" height="341" width="435">      <operator activated="true" class="read_excel" compatibility="5.2.008" expanded="true" height="60" name="Read Excel" width="90" x="45" y="30">        <parameter key="excel_file" value="C:\Users\KenanB\Desktop\TEZ\PTF\TrainDataClusByDay.xlsx"/>        <parameter key="sheet_number" value="2"/>        <parameter key="imported_cell_range" value="A1:AL1045"/>        <parameter key="first_row_as_names" value="false"/>        <list key="annotations">          <parameter key="0" value="Name"/>        </list>        <list key="data_set_meta_data_information">          <parameter key="0" value="Tarih.true.date_time.id"/>          <parameter key="1" value="SGOF.true.numeric.label"/>          <parameter key="2" value="1DayLag.true.numeric.attribute"/>          <parameter key="3" value="2DayLag.true.numeric.attribute"/>          <parameter key="4" value="3DayLag.true.numeric.attribute"/>          <parameter key="5" value="1WeekLag.true.numeric.attribute"/>          <parameter key="6" value="Volatilite.true.real.attribute"/>          <parameter key="7" value="MACD.true.real.attribute"/>          <parameter key="8" value="1HaftaSaatlikOrtalama.true.real.attribute"/>          <parameter key="9" value="1HftSaOrtEksiSapma.true.real.attribute"/>          <parameter key="10" value="1HftSaOrtArtıSapma.true.real.attribute"/>          <parameter key="11" value="4HaftaOrtalama.true.numeric.attribute"/>          <parameter key="12" value="2HaftaOrtalama.true.numeric.attribute"/>          <parameter key="13" value="KGUP-BilateralAggreements.true.numeric.attribute"/>          <parameter key="14" value="Saat0.true.integer.attribute"/>          <parameter key="15" value="Saat1.true.integer.attribute"/>          <parameter key="16" value="Saat2.true.integer.attribute"/>          <parameter key="17" value="Saat3.true.integer.attribute"/>          <parameter key="18" value="Saat4.true.integer.attribute"/>          <parameter key="19" 
value="Saat5.true.integer.attribute"/>          <parameter key="20" value="Saat6.true.integer.attribute"/>          <parameter key="21" value="Saat7.true.integer.attribute"/>          <parameter key="22" value="Saat8.true.integer.attribute"/>          <parameter key="23" value="Saat9.true.integer.attribute"/>          <parameter key="24" value="Saat10.true.integer.attribute"/>          <parameter key="25" value="Saat11.true.integer.attribute"/>          <parameter key="26" value="Saat12.true.integer.attribute"/>          <parameter key="27" value="Saat13.true.integer.attribute"/>          <parameter key="28" value="Saat14.true.integer.attribute"/>          <parameter key="29" value="Saat15.true.integer.attribute"/>          <parameter key="30" value="Saat16.true.integer.attribute"/>          <parameter key="31" value="Saat17.true.integer.attribute"/>          <parameter key="32" value="Saat18.true.integer.attribute"/>          <parameter key="33" value="Saat19.true.integer.attribute"/>          <parameter key="34" value="Saat20.true.integer.attribute"/>          <parameter key="35" value="Saat21.true.integer.attribute"/>          <parameter key="36" value="Saat22.true.integer.attribute"/>          <parameter key="37" value="Saat23.true.integer.attribute"/>        </list>      </operator>      <operator activated="true" class="weight_by_svm" compatibility="5.2.008" expanded="true" height="76" name="Weight by SVM" width="90" x="101" y="121">        <parameter key="normalize_weights" value="false"/>        <parameter key="C" value="300.0"/>      </operator>      <operator activated="true" class="parallel:optimize_parameters_evolutionary_parallel" compatibility="5.1.000" expanded="true" height="130" name="Optimize Parameters (Evolutionary)" width="90" x="246" y="30">        <list key="parameters">          <parameter key="SVM.C" value="[100;500]"/>          <parameter key="SVM.nu" value="[0.01;0.5]"/>          <parameter key="SVM.gamma" value="[0.000001;0.01]"/>        
  <parameter key="Select by Weights.k" value="[1;36]"/>        </list>        <parameter key="max_generations" value="100"/>        <parameter key="population_size" value="8"/>        <parameter key="keep_best" value="false"/>        <parameter key="selection_type" value="roulette wheel"/>        <parameter key="crossover_prob" value="0.5"/>        <parameter key="number_of_threads" value="8"/>        <parameter key="parallelize_optimization_process" value="true"/>        <process expanded="true" height="296" width="524">          <operator activated="true" class="select_by_weights" compatibility="5.2.008" expanded="true" height="94" name="Select by Weights" width="90" x="45" y="30">            <parameter key="weight_relation" value="top k"/>            <parameter key="k" value="3"/>          </operator>          <operator activated="true" class="x_validation" compatibility="5.2.008" expanded="true" height="112" name="Validation" width="90" x="179" y="30">            <parameter key="number_of_validations" value="5"/>            <parameter key="sampling_type" value="shuffled sampling"/>            <parameter key="parallelize_training" value="true"/>            <parameter key="parallelize_testing" value="true"/>            <process expanded="true" height="332" width="330">              <operator activated="true" class="support_vector_machine_libsvm" compatibility="5.2.008" expanded="true" height="76" name="SVM" width="90" x="120" y="30">                <parameter key="svm_type" value="nu-SVR"/>                <parameter key="gamma" value="0.005646641766942668"/>                <parameter key="C" value="349.3487713713193"/>                <parameter key="nu" value="0.43304461037008907"/>                <parameter key="cache_size" value="240"/>                <list key="class_weights"/>                <parameter key="calculate_confidences" value="true"/>              </operator>              <connect from_port="training" to_op="SVM" to_port="training set"/>             
 <connect from_op="SVM" from_port="model" to_port="model"/>              <portSpacing port="source_training" spacing="0"/>              <portSpacing port="sink_model" spacing="0"/>              <portSpacing port="sink_through 1" spacing="0"/>            </process>            <process expanded="true" height="332" width="330">              <operator activated="true" class="apply_model" compatibility="5.2.008" expanded="true" height="76" name="Apply Model" width="90" x="45" y="30">                <list key="application_parameters"/>              </operator>              <operator activated="true" class="performance_regression" compatibility="5.2.008" expanded="true" height="76" name="Performance" width="90" x="187" y="30">                <parameter key="main_criterion" value="root_mean_squared_error"/>                <parameter key="absolute_error" value="true"/>                <parameter key="relative_error" value="true"/>                <parameter key="correlation" value="true"/>                <parameter key="spearman_rho" value="true"/>                <parameter key="kendall_tau" value="true"/>              </operator>              <connect from_port="model" to_op="Apply Model" to_port="model"/>              <connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>              <connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>              <connect from_op="Performance" from_port="performance" to_port="averagable 1"/>              <portSpacing port="source_model" spacing="0"/>              <portSpacing port="source_test set" spacing="0"/>              <portSpacing port="source_through 1" spacing="0"/>              <portSpacing port="sink_averagable 1" spacing="0"/>              <portSpacing port="sink_averagable 2" spacing="0"/>            </process>          </operator>          <operator activated="true" class="performance_attribute_count" compatibility="5.2.008" expanded="true" height="76" 
name="Performance (2)" width="90" x="94" y="201"/>          <operator activated="true" class="combine_performances" compatibility="5.2.008" expanded="true" height="60" name="Performance (3)" width="90" x="232" y="194">            <list key="criteria_weights">              <parameter key="root_mean_squared_error" value="0.7"/>              <parameter key="number_of_attributes" value="0.3"/>            </list>          </operator>          <operator activated="true" class="log" compatibility="5.2.008" expanded="true" height="76" name="Log" width="90" x="380" y="165">            <parameter key="filename" value="C:\Users\KenanB\Desktop\TEZ\log\log1.txt"/>            <list key="log">              <parameter key="App" value="operator.Validation.value.applycount"/>              <parameter key="C" value="operator.SVM.parameter.C"/>              <parameter key="Nu" value="operator.SVM.parameter.nu"/>              <parameter key="Gamma" value="operator.SVM.parameter.gamma"/>              <parameter key="RMSE" value="operator.Performance.value.root_mean_squared_error"/>              <parameter key="MAPE" value="operator.Performance.value.relative_error"/>              <parameter key="MAE" value="operator.Performance.value.absolute_error"/>              <parameter key="Corr" value="operator.Performance.value.correlation"/>              <parameter key="K" value="operator.Select by Weights.parameter.k"/>            </list>          </operator>          <connect from_port="input 1" to_op="Select by Weights" to_port="weights"/>          <connect from_port="input 2" to_op="Select by Weights" to_port="example set input"/>          <connect from_op="Select by Weights" from_port="example set output" to_op="Validation" to_port="training"/>          <connect from_op="Validation" from_port="model" to_port="result 1"/>          <connect from_op="Validation" from_port="training" to_op="Performance (2)" to_port="example set"/>          <connect from_op="Validation" from_port="averagable 1" 
to_op="Performance (2)" to_port="performance"/>          <connect from_op="Performance (2)" from_port="performance" to_op="Performance (3)" to_port="performance"/>          <connect from_op="Performance (3)" from_port="performance" to_op="Log" to_port="through 1"/>          <connect from_op="Log" from_port="through 1" to_port="performance"/>          <portSpacing port="source_input 1" spacing="0"/>          <portSpacing port="source_input 2" spacing="0"/>          <portSpacing port="source_input 3" spacing="0"/>          <portSpacing port="sink_performance" spacing="0"/>          <portSpacing port="sink_result 1" spacing="0"/>          <portSpacing port="sink_result 2" spacing="0"/>          <portSpacing port="sink_result 3" spacing="0"/>        </process>      </operator>      <connect from_op="Read Excel" from_port="output" to_op="Weight by SVM" to_port="example set"/>      <connect from_op="Weight by SVM" from_port="weights" to_op="Optimize Parameters (Evolutionary)" to_port="input 1"/>      <connect from_op="Weight by SVM" from_port="example set" to_op="Optimize Parameters (Evolutionary)" to_port="input 2"/>      <connect from_op="Optimize Parameters (Evolutionary)" from_port="performance" to_port="result 1"/>      <connect from_op="Optimize Parameters (Evolutionary)" from_port="parameter" to_port="result 3"/>      <connect from_op="Optimize Parameters (Evolutionary)" from_port="result 1" to_port="result 2"/>      <connect from_op="Optimize Parameters (Evolutionary)" from_port="result 2" to_port="result 4"/>      <portSpacing port="source_input 1" spacing="0"/>      <portSpacing port="sink_result 1" spacing="0"/>      <portSpacing port="sink_result 2" spacing="0"/>      <portSpacing port="sink_result 3" spacing="0"/>      <portSpacing port="sink_result 4" spacing="0"/>      <portSpacing port="sink_result 5" spacing="0"/>    </process>  </operator></process>`

RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
Hi,

the Combine Performances operator should "know" the direction of the criteria. The new documentation states:

"It should be noted that some criteria values are considered positive by this operator e.g. accuracy. On the other hand some criteria values (usually error related) are considered negative by this operator e.g. relative error."

Did you check whether that holds for your problem? Maybe you can test it in a small example process.
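
To make the point about direction concrete, here is a minimal Python sketch of a weighted combination in which error-like criteria are negated before weighting, so that a larger combined value is always better. This is not RapidMiner code and is not meant to describe how Combine Performances is implemented. The function name `combined_fitness`, the `DIRECTION` table, and the 0.5/0.2/0.3 weights are assumptions for illustration; the two candidates use rounded RMSE, K and Corr values from the log lines in the process description above.

```python
# Direction of each criterion: +1 means larger is better, -1 means smaller is better.
# (Hypothetical table for illustration; not taken from RapidMiner.)
DIRECTION = {
    "root_mean_squared_error": -1,
    "number_of_attributes": -1,
    "correlation": +1,
}


def combined_fitness(values, weights):
    """Weighted sum of criteria after flipping the sign of error-like ones."""
    return sum(weights[name] * DIRECTION[name] * values[name] for name in weights)


# Assumed weights, chosen only for this example.
weights = {
    "root_mean_squared_error": 0.5,
    "number_of_attributes": 0.2,
    "correlation": 0.3,
}

# Rounded values from two logged runs (RMSE, K, Corr).
candidate_a = {
    "root_mean_squared_error": 11.007,
    "number_of_attributes": 36,
    "correlation": 0.9348,
}
candidate_b = {
    "root_mean_squared_error": 11.459,
    "number_of_attributes": 33,
    "correlation": 0.9337,
}

fitness_a = combined_fitness(candidate_a, weights)
fitness_b = combined_fitness(candidate_b, weights)

# The candidate with the larger combined fitness is preferred.
best = "a" if fitness_a > fitness_b else "b"
print(fitness_a, fitness_b, best)
```

Note that the three criteria live on very different scales here (correlation is at most 1, RMSE is around 11, and the attribute count is in the tens), so with raw values the correlation term contributes little to the sum. If correlation should really influence the result, the criteria would probably need to be rescaled or the weights adjusted accordingly.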

Best regards,
Marius