Multiple Jobs on Clusters

Apr 17, 2013 at 11:01 PM
Hi All,

Quick question for anyone who might be knowledgeable. I am about to start some C#-coded jobs on a cluster and have once again been thinking about how annoying it is to spawn jobs, collect the results, and process them. Right now, my approach is pretty ad hoc: I am essentially submitting jobs to the LSF queue from the command line and polling for the results, also from the command line.
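A minimal sketch of this kind of command-line submit-and-poll workflow, written in Python rather than C# for brevity. It assumes LSF's standard `bsub` and `bjobs` commands and their default output formats. The helper names (`build_bsub_command`, `parse_job_id`, `parse_bjobs_state`, `submit`) are hypothetical and not part of LSF or .NET Bio.

```python
import re
import subprocess


def build_bsub_command(job_name, output_file, command):
    """Build the argument list for bsub (-J names the job, -o sets its output file)."""
    return ["bsub", "-J", job_name, "-o", output_file] + list(command)


def parse_job_id(bsub_stdout):
    """Extract the numeric ID from bsub's 'Job <1234> is submitted ...' message."""
    match = re.search(r"Job <(\d+)>", bsub_stdout)
    if match is None:
        raise ValueError("no job ID found in: %r" % bsub_stdout)
    return int(match.group(1))


def parse_bjobs_state(bjobs_stdout, job_id):
    """Return the STAT column (e.g. PEND, RUN, DONE, EXIT) for job_id, or None.

    Assumes the default bjobs layout: JOBID USER STAT QUEUE ...
    """
    for line in bjobs_stdout.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] == str(job_id):
            return fields[2]
    return None


def submit(job_name, output_file, command):
    """Submit one job and return its ID. Only works where LSF is installed."""
    result = subprocess.run(
        build_bsub_command(job_name, output_file, command),
        capture_output=True, text=True, check=True,
    )
    return parse_job_id(result.stdout)
```

A driver would call `submit` once per task, then run `bjobs` periodically and pass its output to `parse_bjobs_state` until every job reports `DONE` or `EXIT`.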

Would be curious to know if anyone else is doing work on clusters, and if so, how they are doing their scatter/gather, so to speak.

-Nigel
Apr 18, 2013 at 12:05 AM
Hi Nigel

I don't know if anyone else has done something like I did: I built a small Windows HPC cluster using Windows Server 2008 R2, with one head node and three worker nodes, 32 GB of RAM and 28 Intel cores, specifically to work with the MBF tools and later with the .NET Bio 1.0 tools. I have run a lot of tests with Padena using different sequences. I was maybe one of the first people (maybe the first) to ask the .NET Bio developer team for a quick solution for Padena performance optimization, and I have tried to join the initiative, without success so far. (I think it is very difficult to get the Developer label in this project, and maybe this discourages me a little. I have followed the project for almost two and a half years and I continue with it because I think .NET Bio has a lot of future and I'm still in love with Microsoft technologies, so I'm now working on visualization technologies with amazing tools like BLIP and with an amazing guy like Vince Forgetta.) I have some experience in this topic and will be happy to discuss it. As for "how they are doing their scatter/gather", it is not very clear to me; I don't know if you are talking about how to launch jobs more efficiently and get the information back in an organized way.



Let's talk about .NET Bio on cluster architecture.


regards,

Leo
Apr 18, 2013 at 1:35 AM
Hi Leo,

Thanks for the response. By scatter/gather, I just meant how you distribute tasks to several computers and then gather the results to continue with the analysis in one central process.
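To make the pattern concrete, here is a minimal Python sketch of scatter/gather. It uses a local thread pool as a stand-in for cluster nodes, so it illustrates only the shape of the workflow, not an actual LSF or HPC Pack setup. The names `scatter_gather` and `count_gc` and the example sequences are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def scatter_gather(task, inputs, max_workers=4):
    """Scatter: run task on every input in parallel.
    Gather: return the results in the same order as the inputs."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(task, inputs))


def count_gc(sequence):
    """Example task: count G and C bases in one sequence chunk."""
    return sum(1 for base in sequence.upper() if base in "GC")


if __name__ == "__main__":
    chunks = ["ACGT", "GGCC", "ATAT"]
    per_chunk = scatter_gather(count_gc, chunks)  # scatter, then gather
    total = sum(per_chunk)                        # continue in one central process
    print(per_chunk, total)
```

On a real cluster, the pool would be replaced by job submissions to the scheduler, and the gather step would read each job's output once it finishes.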

Just played with BliP a bit online, looks quite nice!

Cheers,
Nigel
Apr 18, 2013 at 2:44 AM
Edited Apr 18, 2013 at 3:12 AM
Hi Nigel

I don't know if my work with a Windows cluster may help you; if it does, please let me know. I'm very interested in continuing with this. I decided to suspend the project because I am from Latin America and it looks like there is nobody here with experience in this area, but I wish to continue; I almost think it is my scientific duty. I made something like what you described using HPC Pack 2008 R2. I can tell you about my experience whenever you want.
Cheers,

Leo
Apr 18, 2013 at 4:06 AM
Hey Leo, thanks for offering. Unfortunately, I don't have access to an HPC cluster, but if you ever want to chat about LSF, let me know!
Apr 18, 2013 at 4:11 PM
Hi Leonardo,

If there are any more questions I can answer for you regarding membership in .NET Bio, do please let me know. I know we have had several conversations offline about this issue and I believe you have also spoken to Rick and the CodePlex administrators as well.

To summarize, though: you are already a developer. There is no specific developer role; all you need to do is create some code and submit it for review. One of the committers on the project will review your contribution and work with you if necessary to make sure it conforms with coding standards and other issues (I believe Nigel is doing that with Mark right now). If the code works and meets the standards described on our documentation page, it will be checked in. After a few successful check-ins, we may grant you access to the repository so you can review other people's code. That's how the community works. There is no developer role.

Let me know if you have any further questions - or if other people have had experience with code submissions and want to add their views please feel free.

Simon
Apr 18, 2013 at 6:41 PM
Hi Simon

I want to be clear: I'm grateful to you and Rick for your answers and help with my process. Thanks again for your valuable and timely answer, as always. I also know that I'm just beginning the process of making code submissions, and I have never worked with TFS. I have read the documentation about the developer and committer process, and I know the process because you have also told me about it. I'm working now on contributing code, but I think it is difficult, because a person without direct (real-time) access to the repository is somewhat blind to other contributions, in the sense that other people may be working on the same issue (the Padena case), even though the discussion tab exists. I also hope you know that making an important code contribution is not an easy or fast process, especially for someone like me who is a bioinformatics beginner (I have been studying this topic for only two and a half years). However, I also know that if every person close to the project became a developer, it would be a mess.

Leo