Distributed scheduling recommended reading?

    OK, so part of why I'm hating MySQL's locking behaviour is that I'm playing with scheduling a large distributed job for a personal project. I've talked to some folk about Beowulf, and it doesn't seem to offer me much. Does anyone have any recommended reading on how research clusters solve this sort of problem that they'd like to share in the comments?

    Tags for this post: blog(S)

posted at: 14:12 | path: /diary | permanent link to this entry





    Chris Samuel

    From my perspective, HPC clusters are doing batch computing with a queueing system (Torque, a well-maintained open source version of PBS, in our case) and a scheduler that allocates jobs to nodes (Moab for us, but its free sibling, the Maui scheduler from ClusterResources, is very capable).
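To make the queue-plus-scheduler split above a bit more concrete, here is a toy sketch in Python. It is not how Torque, Moab or Maui are implemented; `Job`, `Node` and `schedule` are made-up names for illustration, and the policy is the simplest one possible: strict first-in, first-out, where the job at the head of the queue blocks everything behind it until some node has enough free cores.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Job:
    name: str
    cores: int  # number of cores the job asks for (hypothetical resource model)


@dataclass
class Node:
    name: str
    free_cores: int
    running: list = field(default_factory=list)


def schedule(queue, nodes):
    """Place queued jobs on nodes in strict FIFO order.

    Stops at the first job that fits on no node, so later (smaller)
    jobs wait behind it even if they would fit somewhere.
    """
    placements = []
    while queue:
        job = queue[0]
        # First-fit: take the first node with enough free cores.
        node = next((n for n in nodes if n.free_cores >= job.cores), None)
        if node is None:
            break  # head of the queue blocks the rest
        queue.popleft()
        node.free_cores -= job.cores
        node.running.append(job.name)
        placements.append((job.name, node.name))
    return placements


queue = deque([Job("a", 4), Job("b", 8), Job("c", 2)])
nodes = [Node("node1", 8), Node("node2", 4)]
placements = schedule(queue, nodes)
print(placements)
```

In this run only job `a` gets placed: job `b` wants 8 cores and no node has that many free, so `c` waits behind it even though it would fit. Production schedulers deal with this using techniques such as backfill, which is a large part of what a dedicated scheduler adds on top of a plain queue.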

    For filesystems, well, we still use NFS (quick, easy, works everywhere, probably not ideal), but others use things like Lustre, GPFS and CXFS, with distributed lock managers and the like.

    It all depends on what you are trying to do, really. You may be best off approaching Stewart to see if he's got any bright ideas; given he works for MySQL now, he should have some. :-)

