
dependency management for plugins?

holgerholger Member Posts: 42 Contributor II
edited November 2018 in Help
Hi,

I'm currently working on some rm5 plugins which wrap some internal, in-house-only libraries/apps into rm-operators. As some of these libraries have huge dependency trees (~30 jars) which partially overlap with the rm-dependencies, my question is how to deploy the plugins. Is there a way to embed the jars of the dependencies into the plugin-jar? Or do I need to mess up the rm5-lib-directory? Do I need to be careful that my plugin-dependencies don't mess up the class-path of RM itself?

I've noticed that for the weka-plugin, the weka.jar was simply unzipped into the extension jar. The same seems to be done for the reporting-plugin. However, for more complex plugins this approach does not seem feasible or elegant if many dependencies are required.

I've scanned through the (partially outdated) rm4.4 documentation, but even though some sections (like 2.5) touch on the topic, the solution is not yet clear to me.

Personally, I consider the issue of dependency management a core issue of any plugin model. Only if it is solved properly, people are likely to start to develop plugins.

Any help/hint/idea/link is welcome. :-)

Best, Holger

Answers

  • fischerfischer Member Posts: 439 Maven
    Hi Holger,

    you certainly have a point there. The dependency management is limited, and currently there is no other way to include library jars. There are frameworks around, but currently I don't see that any of these suit the needs we have for certain applications. Up to now, there has been no demand for a more sophisticated dependency management, but we can certainly extend this. If you are interested in any kind of cooperation regarding this issue, please get back to me.

    Cheers,
    Simon
  • holgerholger Member Posts: 42 Contributor II
    Hi Simon,

    I've googled around and this is what needs to be done imho:

    1) First search locally for the classes in the plugin-dependency jars and then delegate to the parent class loader:
    http://fdt.powerflasher.com/blog/?tag=urlclassloader
    This (the loadClass()-implementation in the linked article) makes the plugin-system prefer plugin-dependency classes over RM-dependency ones.

    2) URL access to nested .jar files
    http://zamboch.blogspot.com/2009/03/url-access-to-nested-jar-files.html

    3) Adapt the PluginClassLoader/Plugin-classes accordingly.
    - Add an iterator over the plugin-jar lib dir to create urls for all plugin-dependencies.
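
The lookup order from 1) can be sketched roughly as follows. This is only an illustration of a "child-first" loader: the class name `ChildFirstClassLoader` is hypothetical, and it is neither the code from the linked article nor RapidMiner's PluginClassLoader. It assumes Java 7 or later (for `getClassLoadingLock`).

```java
import java.net.URL;
import java.net.URLClassLoader;

/** Hypothetical child-first loader: searches its own URLs before asking the parent. */
class ChildFirstClassLoader extends URLClassLoader {

    ChildFirstClassLoader(URL[] urls, ClassLoader parent) {
        super(urls, parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            // Reuse a class this loader has already defined.
            Class<?> c = findLoadedClass(name);
            if (c == null) {
                if (name.startsWith("java.")) {
                    // Core JDK classes must always come from the parent chain.
                    c = super.loadClass(name, false);
                } else {
                    try {
                        // Look in the plugin's own jars first ...
                        c = findClass(name);
                    } catch (ClassNotFoundException e) {
                        // ... and only then fall back to normal parent-first delegation.
                        c = super.loadClass(name, false);
                    }
                }
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }
}
```

With such a loader, a library version bundled with the plugin shadows a different version of the same library on the parent's class path, which is the behaviour asked for in 1).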

    What do you think? I could help out a little to implement it, but would probably need your support, as you're much more familiar with RM than I am.

    -Holger
  • fischerfischer Member Posts: 439 Maven
    Hi,

    ad 1) This is what the PluginClassLoader does. In fact the class loader first loads classes provided by the plugin, then classes provided by the plugin it depends on, then classes provided by the RM core.
    ad 2) This is nice. Still, it does not help for sharing utility jars among plugins.
    ad 3) This is what the AllPluginsClassLoader does.

    What is in fact missing is a way to share utility jars, and to use proper versions for dependencies. We still have to think about all this a bit, since we must have a solution that is independent of physical jar files. E.g., when you want to run RM as a server, you want plugins to be dynamically deployable and probably you don't want them to be stored in files.

    Still, any help is appreciated.

    Cheers,
    Simon

  • holgerholger Member Posts: 42 Contributor II
    Hi Simon,

    thanks for your comments.

    here's a patch which contains a working draft of a better way for plugin-deployment:
    http://idisk.mpi-cbg.de/~brandl/loadjarsfromjars.patch

    It contains the following changes:

    - don't patch plugin-jar URL in Plugin.java (because there's no need to do so)
    - don't unjar plugin-dependencies but copy them into the lib-folder of the jar
    - load plugin-dependency jars directly from plugin-jars by using a special url-protocol-handler

    It works for me. :-)  However, this is what is missing:
    - plugin-dependencies are not yet migrated to the new model

    > use proper versions for dependencies
    What do you mean by that? Should plugin-dependencies include the version of the referred plugin (which would make sense to me)?

    > you want plugins to be dynamically deployable and probably you don't want them to be stored in files
    This is not clear to me. Even if RM is running on a server, don't classes need to be located somewhere in a file? Or do you want to deploy remote plugins where you just have a url and that's it?

    I could work on the missing piece, which is the cross-plugin dependencies. Of course only if you're actually interested in integrating my patches into the RM-codebase as soon as they become stable.

    The fix should probably also split Plugin into a PluginFactory (called by RapidMiner.java), which includes all the static stuff, and a cleaned-up Plugin class which just contains plugin-instance specific stuff. IMHO this would greatly increase the code-quality. What do you think?

    -Holger
  • holgerholger Member Posts: 42 Contributor II
    Any progress with the patch? Do you need help with it? Did it cause any problems? We're working with it (custom build) already and everything seems to be fine.

    -Holger
  • fischerfischer Member Posts: 439 Maven
    Hi Holger,

    the patch you sent certainly solves the packaging problem of jars. However, it does not address the core issue of having shared dependent libraries between several plugins. You would still package them into both plugins, which would still result in conflicting and, foremost, confusing class loading issues.

    Also, I wonder why we need this custom JarJar url handler. This is an issue faced when loading classes from wars, ears, etc. so I guess there must be a standard solution, but I am not aware of this.

    Finally, it is still not clear to me if your approach is feasible if the jar does not live in a file, but, e.g. in memory or in a database.

    I still have to think about that.

    Cheers,
    Simon
  • holgerholger Member Posts: 42 Contributor II
    Hi Simon,

    thanks for your response.

    Concerning the inter-plugin dependencies:
    I would work out a fix if you think that this issue should be resolved for RM and my patch is likely to become integrated. IMHO the solution would look like this: if a plugin A is declared to depend on another plugin B, all urls of the pluginB-classloader would also be added to the classloader-urls of pluginA.
    The only thing to be prevented is infinite loops in case of cyclic dependencies (which should not occur normally, but you never know what happens, and it is thus just more robust to guard against them).
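
One way to picture the idea is the following sketch, which gathers the jar locations of a plugin and of everything it depends on, with a visited set as the guard against cycles. The names `PluginClasspath` and `PluginNode` are hypothetical and do not exist in RapidMiner; the sketch also follows dependencies transitively, which goes a little beyond the direct A-to-B case described above.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class PluginClasspath {

    /** Hypothetical descriptor: a plugin's own jar locations plus the names of the plugins it depends on. */
    static final class PluginNode {
        final List<URI> ownJars;
        final List<String> dependsOn;

        PluginNode(List<URI> ownJars, List<String> dependsOn) {
            this.ownJars = ownJars;
            this.dependsOn = dependsOn;
        }
    }

    /** Returns the plugin's own jars followed by the jars of all plugins it depends on. */
    static List<URI> resolve(String pluginName, Map<String, PluginNode> registry) {
        Set<URI> collected = new LinkedHashSet<>(); // keeps order, drops duplicate jars
        collect(pluginName, registry, new HashSet<String>(), collected);
        return new ArrayList<>(collected);
    }

    private static void collect(String name, Map<String, PluginNode> registry,
                                Set<String> visited, Set<URI> out) {
        if (!visited.add(name)) {
            return; // already handled: this is what stops cyclic dependencies from looping
        }
        PluginNode node = registry.get(name);
        if (node == null) {
            throw new IllegalArgumentException("Unknown plugin: " + name);
        }
        out.addAll(node.ownJars);
        for (String dependency : node.dependsOn) {
            collect(dependency, registry, visited, out);
        }
    }
}
```

The resulting list could then be converted to URLs and handed to the plugin's class loader. Putting the plugin's own jars first means its classes win over those of its dependencies.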

    Surely, with this approach, static fields would not be shared between plugins, but as static fields are bad design anyway, I think it is a proper solution.

    If you're willing to integrate a corresponding patch, please let me know, and I'll work on it, test it and send it to you asap.


    Concerning "Also, I wonder why we need this custom JarJar url handler."
    Without the handler there's no way to make plugin-dependencies which are located within a plugin.jar accessible to the PluginClassLoader. As the classloader extends URLClassLoader, any class-path entry needs to be a url. The JarJar handler just defines another kind of url, which allows referencing jars within jars.
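
For comparison only, a different and simpler technique than the custom handler in the patch is to copy the bundled jars out of the plugin jar into temporary files and give the class loader ordinary file urls. The sketch below shows that alternative. The class name `NestedJarExtractor` is hypothetical, and the `lib/` prefix is assumed from the "lib-folder of jar" layout described earlier in the thread.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

class NestedJarExtractor {

    /** Copies every lib/*.jar entry of the plugin jar to a temp directory and returns file urls for them. */
    static List<URL> extractLibJars(Path pluginJar) throws IOException {
        List<URL> urls = new ArrayList<>();
        // Note: the temp directory is not cleaned up in this sketch.
        Path tempDir = Files.createTempDirectory("plugin-libs");
        try (JarFile jar = new JarFile(pluginJar.toFile())) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                JarEntry entry = entries.nextElement();
                String name = entry.getName();
                if (entry.isDirectory() || !name.startsWith("lib/") || !name.endsWith(".jar")) {
                    continue; // only bundled dependency jars are of interest
                }
                // Use the bare file name so entries cannot escape the temp directory.
                Path target = tempDir.resolve(name.substring(name.lastIndexOf('/') + 1));
                try (InputStream in = jar.getInputStream(entry)) {
                    Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
                }
                urls.add(target.toUri().toURL());
            }
        }
        return urls;
    }
}
```

The returned urls can be passed to a URLClassLoader together with the url of the plugin jar itself. The trade-off is extra disk I/O at startup and temp files to clean up, and it still assumes the plugin jar lives in a file.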

    Concerning "jar does not live in a file, but, e.g. in memory or in a database."
    What is actually necessary is that you have an appropriate URLConnection implementation for each case. Otherwise the plugin-classloader cannot handle it. This is the current situation and would not change by integrating the patch.

    Another good thing about my patch is that it is still compatible with the existing plugins where the dependencies are unjarred into the plugin.jar. However, I think that redeploying them would make life easier (faster deployment, decoupling of plugin-classes and dependency-classes within the plugin-package, etc.)

    Please let me know if you have further questions or ideas.

    best, Holger
  • fischerfischer Member Posts: 439 Maven
    Hi.

    I partially agree with your solution.

    However, we still have to clarify some things. Let's introduce some terms so we speak about the same things:

    - We have plugin jars containing the actual plugin code.
    - We have additional utility jars on which the plugins depend.

    Utility jars should, in my opinion, not be bundled with the plugins, neither re-jarred nor in the lib folder, since this way we cannot share utility jars. So, utility jars go to, say, plugins/lib or plugins/ext, or whatever. All jars in plugins/lib are added to the URL classloader of all plugins. Better yet: we have a parent classloader for all plugins for these utility jars. In that case, we don't even have a problem with static fields. Plugin classloaders thus only need to resolve dependencies among plugins.

    Such a solution I would integrate with RapidMiner, but before you start developing get back to me so we can give you a lock on the plugin package :-)

    Still open problems are:
    - Versioning
    - How do we deploy to sourceforge? Utility jars separate? Installer? Actually, we wanted to avoid installers for plugins.
    - How do we deploy via update server? I think I can address that problem.

    Cheers,
    Simon

    P.S. Again replying to the custom JarJarClassLoader: My question was not, why we need it but rather why there is no "standard" implementation for that. After all, it is a common problem. I remember I searched for that some time ago, but came up with no standard solution.
  • holgerholger Member Posts: 42 Contributor II
    hi Simon,

    thanks for your response.
    > Utility jars should, in my opinion, not be bundled with the plugins, neither re-jarred nor in the lib folder since this way we cannot share utility jars.
    Why? What would the benefits be?
    Imho they SHOULD be bundled because of the three reasons which you pointed out correctly (even if you phrased them as questions):

    1.
    > - Versioning
    If utility jars are part of the plugin (which is perfectly possible because of the jarjar-url-connection and the custom class-loader per plugin), plugin-devs are free to rely on whatever version of utility jars they want to. There are by definition no versioning conflicts because each plugin manages its own utility-dependency tree.

    2.
    > - How do we deploy to sourceforge? Utility jars separate? Installer? Actually, we wanted to avoid installers for plugins.
    If utility jars are part of the plugin-jar (as I suggested above and as it's implemented by the patch) there's no need to deploy them separately (which would make deployment unnecessarily complex imho).

    3.
    > - How do we deploy via update server? I think I can address that problem.
    Everything can remain as it is if utility-jars are bundled. The patch I've provided does not require any changes. It simply adds the possibility to bundle utility jars with a plugin without unjarring them.

    Actually I've seen the "solution" which puts all utility jars into a common plugins/ext folder in another opensource project (fiji), and it turned out to be a bad idea because of all three reasons mentioned above. Having a shared utility-jar class loader even makes things more complicated, as it requires that all plugins use the same version of a particular utility-library.
    I think it's a widely supported design principle to avoid static fields as much as possible, so the fact that they are not shared between plugin-class-loaders is a feature, not a bug.

    The only issue which we may need to clarify in more detail (before I revisit the patch if necessary) is how inter-plugin-dependencies should be done. RM already provides this option, and by changing it as described in my last post (-> add all urls of a plugin-dependency to the url-classpath of the plugin itself), I think the plugin-architecture of RM would become really powerful.

    lg, Holger

    ps. concerning jarjar and "why there is no "standard" implementation for that": Good question. I think it's a flaw of the URLClassLoader implementation in the jdk. And even if I don't know the specs in detail, I think the issue of deploying apps/plugins with dependencies is addressed in java7 by introducing a new packaging-format which is designed to replace jar-files.
  • holgerholger Member Posts: 42 Contributor II
    Hi Simon,

    I've just tried to patch the plugin-dependency problem, and created another patch. It patches the basic plugin-dependency management (no dag-dependencies yet, just simple dependencies without further subdependencies). It also includes the loading of embedded utility jars directly from plugin-jars as described above.

    http://dl.dropbox.com/u/422074/patched_plugin-dependency_management.patch

    For me the patch is working nicely when creating inter-plugin dependencies, so just give it a try.

    This is what I've observed:
    1) the plugin class needs to be split into a factory and a clean plugin-class. Currently it is hard to understand.
    2) the distinction between managed and non-managed dependencies is bad design. Plugins should simply have a property "update-server" which allows them to be updated. RM should work like Eclipse here, which allows several update-servers. IMHO this should not be that tricky to do.
    3) the dependency-version should be a minimal version, as an exact match is unlikely to happen and would require upgrading everything for each minor bugfix release of a dependent plugin.
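
Point 3) amounts to an "at least this version" check instead of an equality check. A minimal sketch, assuming purely numeric, dot-separated version strings such as "5.0.10" (the class `VersionCheck` is hypothetical, and qualifiers like "-beta" are not handled):

```java
class VersionCheck {

    /** Compares dotted numeric versions part by part; missing parts count as 0, so "5.0" equals "5.0.0". */
    static int compare(String a, String b) {
        String[] partsA = a.split("\\.");
        String[] partsB = b.split("\\.");
        int length = Math.max(partsA.length, partsB.length);
        for (int i = 0; i < length; i++) {
            int x = i < partsA.length ? Integer.parseInt(partsA[i]) : 0;
            int y = i < partsB.length ? Integer.parseInt(partsB[i]) : 0;
            if (x != y) {
                return Integer.compare(x, y);
            }
        }
        return 0;
    }

    /** True if the installed plugin version is the required minimum or newer. */
    static boolean satisfiesMinimum(String installed, String required) {
        return compare(installed, required) >= 0;
    }
}
```

Comparing the parts as numbers rather than as strings matters here: a plain string comparison would rank "5.0.9" above "5.0.10".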


    Tell me what you think. As discussed with Ralf Klinkenberg recently, the release of our R and MATLAB integration plugins depends on this issue, so the sooner we get it fixed, the sooner we can release the plugins. :-)

    Best, Holger
  • fischerfischer Member Posts: 439 Maven
    Hi Holger,

    for your latest patch, all hunks seem to fail, no idea why. Could you send me the four relevant files by email? I'm fischer at rapid-i dot com.

    Cheers,
    Simon