No checkpoints?

Message boards : Number crunching : No checkpoints?

Zoltán
Joined: 27 Jan 21
Posts: 2
Credit: 62,625
RAC: 0
Message 791 - Posted: 16 Apr 2021, 18:36:32 UTC - in response to Message 788.  

Me too, I quit until the checkpoint issue is resolved.
ID: 791 · Rating: 0

Dad
Joined: 6 Nov 20
Posts: 2
Credit: 2,519,470
RAC: 0
Message 793 - Posted: 18 Apr 2021, 8:59:50 UTC

Help, Natalia!
Add some checkpoints, if you can.
You are our only hope!
ID: 793 · Rating: 0

Falconet
Joined: 24 Oct 20
Posts: 23
Credit: 9,020
RAC: 0
Message 794 - Posted: 18 Apr 2021, 11:51:17 UTC

As stated on the forums, they are working on checkpoints.

https://www.sidock.si/sidock/forum_thread.php?id=105&postid=776#776
ID: 794 · Rating: 0

Havis
Joined: 2 Mar 21
Posts: 5
Credit: 26,113,755
RAC: 27,511
Message 796 - Posted: 19 Apr 2021, 9:50:54 UTC

This is a great project! Please, please add checkpoint support, thanks ;-)
ID: 796 · Rating: 0

Havis
Joined: 2 Mar 21
Posts: 5
Credit: 26,113,755
RAC: 27,511
Message 797 - Posted: 19 Apr 2021, 9:52:16 UTC

This is a great project! Please, please add checkpoint support. Thanks ;-)
ID: 797 · Rating: 0

Crtomir
Volunteer moderator
Project developer
Project scientist
Joined: 11 Nov 20
Posts: 47
Credit: 83,493
RAC: 0
Message 801 - Posted: 19 Apr 2021, 15:00:57 UTC - in response to Message 797.  

Dear Boincers,

we are working to solve the checkpoint issue, which is critical in the case of long-running docking problems.
The E-protein has a relatively large binding site, and it takes more time to sample configurations/conformations of
the ligand within the binding site. The suggestion to split WUs into smaller ones seems like a good idea in the
meantime.

To provide a better experience, some changes will be made soon:

* we will add new server(s) to the project
* we will increase disk space (this is probably the most urgent)
* I hope that we will have hard-coded checkpointing very soon (it is at the very top of our task list)

All the best,

Crtomir
ID: 801 · Rating: 0

Aurum
Joined: 13 Jan 21
Posts: 76
Credit: 38,846,214
RAC: 0
Message 849 - Posted: 2 May 2021, 10:13:12 UTC
Last modified: 2 May 2021, 10:17:33 UTC

Does SiDock use adaptive replication (Q2R2 changes to Q1R1) or does it always stay as Q2R2 ???
ID: 849 · Rating: 0

crashtech
Joined: 5 Jan 21
Posts: 7
Credit: 23,392,338
RAC: 66,801
Message 871 - Posted: 3 May 2021, 13:24:55 UTC

There was a power outage this morning, which in my case means I have totally lost hundreds of hours of computing, approximately the work of 220 logical cores over the span of 12 hours...
ID: 871 · Rating: 0

Natalia
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 9 Oct 20
Posts: 185
Credit: 2,782,517
RAC: 50
Message 875 - Posted: 4 May 2021, 9:17:30 UTC - in response to Message 849.  

Even with Q2R2, we still need to recompute certain tasks. Quorum = 2 helps a lot.
ID: 875 · Rating: 0

bozz4science
Joined: 31 Oct 20
Posts: 32
Credit: 847,529
RAC: 123
Message 877 - Posted: 4 May 2021, 9:52:32 UTC

With the BOINC Pentathlon approaching, I am curious as to where you stand on checkpointing. What is holding you up atm from implementing this feature?
ID: 877 · Rating: 0

Natalia
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 9 Oct 20
Posts: 185
Credit: 2,782,517
RAC: 50
Message 879 - Posted: 4 May 2021, 10:18:14 UTC - in response to Message 877.  

The team needs to implement them in CmDock for all platforms, build and test. It is not a good idea to hurry just before the competition...
ID: 879 · Rating: 0

bozz4science
Joined: 31 Oct 20
Posts: 32
Credit: 847,529
RAC: 123
Message 882 - Posted: 4 May 2021, 11:46:28 UTC - in response to Message 879.  

I second that thought. Might be wise to hold back on this plan for the time being. But just a little longer, or else I fear that you might lose volunteers with powerful systems who understandably get angry with you about loads of kWh being potentially lost.
ID: 882 · Rating: 0

Michael H.W. Weber
Joined: 7 Nov 20
Posts: 8
Credit: 11,239,887
RAC: 6,266
Message 889 - Posted: 5 May 2021, 7:55:53 UTC

The CmDock GitLab repo just reported the appearance of a new version with the desperately awaited checkpointing feature.
So, things are well in the works. ;-)

Michael.
President of Rechenkraft.net - This world's first and largest distributed computing organization. We make those things possible that supercomputers don't.
ID: 889 · Rating: 0

vaughan
Joined: 22 Nov 20
Posts: 10
Credit: 13,167,232
RAC: 127
Message 916 - Posted: 15 May 2021, 4:58:32 UTC

I concur with KPX as all my Win 10 machines decided to do Windows updates and the obligatory reboot cycle at 3am so I lost all the tasks in progress.

Please bring in checkpoints and, while we wait for the application to be upgraded, send out smaller tasks (shorter run times). Many take over 6 hours per core.
ID: 916 · Rating: 0

Rasputin42
Joined: 12 Jan 21
Posts: 13
Credit: 2,513,888
RAC: 0
Message 951 - Posted: 22 May 2021, 17:23:36 UTC

It is not writing checkpoints correctly (Win 8.1, latest BOINC version, checkpoint interval 600 sec):

Application CurieMarieDock on BOINC + zipped input 2.00
Name corona_Eprot_v1_nb3di_203441_1
State Running
Received 21/05/2021 10:56:34
Report deadline 25/05/2021 10:56:37
Estimated computation size 50,000 GFLOPs
CPU time 08:45:21
CPU time since checkpoint 06:24:52
Elapsed time 08:47:10
Estimated time remaining 07:34:02
Fraction done 53.727%
Virtual memory size 148.43 MB
Working set size 149.75 MB
Directory slots/1
Process ID 3924
Progress rate 6.120% per hour
Executable cmdock-boinc-zip_wrapper_2.0_windows_x86_64.exe
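
For anyone who wants to sanity-check these figures, here is a minimal Python sketch. It assumes that the remaining time is simply the unfinished fraction divided by the progress rate, which may not be exactly how the BOINC client computes its estimate, and that "CPU time since checkpoint" is the work that would be redone after a restart if the counter were accurate. The helper names are made up for illustration.

```python
def hms_to_hours(hms: str) -> float:
    """Convert an 'HH:MM:SS' string to fractional hours."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours + minutes / 60 + seconds / 3600


def hours_to_hms(hours: float) -> str:
    """Convert fractional hours back to 'HH:MM:SS', rounded to the second."""
    total = round(hours * 3600)
    return f"{total // 3600:02d}:{total % 3600 // 60:02d}:{total % 60:02d}"


# Values copied from the task properties quoted in this post.
fraction_done = 53.727            # percent
progress_rate = 6.120             # percent per hour
cpu_since_checkpoint = "06:24:52"

# Assumption: remaining time = unfinished fraction / progress rate.
remaining_hours = (100.0 - fraction_done) / progress_rate

# Assumption: this is the CPU time that would have to be repeated after a restart.
at_risk_hours = hms_to_hours(cpu_since_checkpoint)

print("Estimated time remaining:", hours_to_hms(remaining_hours))
print(f"CPU hours at risk on restart: {at_risk_hours:.2f}")
```

Under that assumption the estimate lands within a minute of the 07:34:02 shown in the properties, so the progress figures look consistent with each other; only the checkpoint counter looks off.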
ID: 951 · Rating: 0

hoarfrost
Volunteer moderator
Project administrator
Project developer
Joined: 11 Oct 20
Posts: 338
Credit: 25,686,198
RAC: 9,023
Message 952 - Posted: 22 May 2021, 22:09:06 UTC

Hello! Try checking the file docking_out.chk in the appropriate slot directory. If you see a number (between 1 and 500) and the name of a ligand inside the file, checkpoints are being created. :)
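
The check can also be scripted. The following is only a sketch: it assumes the file holds a number followed by whitespace and a ligand name on one line, which is a guess based on what is described in this thread and not a documented format. The function names and the sample ligand ID are made up for illustration.

```python
import tempfile
from pathlib import Path


def parse_checkpoint(text):
    """Return (ligand_index, ligand_name), or None if the text does not match.

    Assumed layout (not a documented format): '<number> <ligand name>'.
    """
    parts = text.split()
    if len(parts) < 2 or not parts[0].isdigit():
        return None
    return int(parts[0]), parts[1]


def read_checkpoint(slot_dir):
    """Read docking_out.chk from a BOINC slot directory, if it exists."""
    chk_file = Path(slot_dir) / "docking_out.chk"
    if not chk_file.is_file():
        return None
    return parse_checkpoint(chk_file.read_text(errors="replace"))


# Demo on a throwaway directory with a made-up ligand ID, so nothing real is touched.
with tempfile.TemporaryDirectory() as fake_slot:
    (Path(fake_slot) / "docking_out.chk").write_text("462 ZINC000000000000\n")
    print(read_checkpoint(fake_slot))
```

To use it on a real task, pass the slot directory from the task properties (for example the slots/1 folder inside the BOINC data directory) to read_checkpoint.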
ID: 952 · Rating: 0

JagDoc
Joined: 24 Oct 20
Posts: 19
Credit: 11,101,112
RAC: 52,416
Message 953 - Posted: 23 May 2021, 7:50:51 UTC
Last modified: 23 May 2021, 7:59:10 UTC

On Windows 7, BOINC doesn't show the checkpoints correctly:
Project	SiDock@home
Name	corona_Eprot_v1_nb3di_206123_2_1
Application	CurieMarieDock on BOINC + zipped input 2.00
Workunit name	corona_Eprot_v1_nb3di_206123_2
Status	Active
Received	23.05.2021 00:10:17
Report deadline	28.05.2021 00:10:00
Estimated app speed	1.63 GFLOPs/sec
Estimated computation size	50,000 GFLOPs
CPU time at last checkpoint	00:09:39
CPU time	09:00:41
Elapsed time	08:31:45
Estimated time remaining	04:58:35
Fraction done	63.153%
Virtual memory size	147.75 MB
Working set size	151.31 MB
Directory	slots/2
Process ID	2765056

The docking log and docking_out.chk both show 462 and the same ligand name.

On my ARM machines running Linux, BoincTasks shows the checkpoints correctly.
ID: 953 · Rating: 0

Rasputin42
Joined: 12 Jan 21
Posts: 13
Credit: 2,513,888
RAC: 0
Message 954 - Posted: 23 May 2021, 8:09:15 UTC - in response to Message 952.  

Thanks.

The first number is counting up (xxx ZINCyyyyyyyyyyyyyy).
However, the checkpoints appear to be written every 2 min, ignoring the BOINC setting.
Interestingly, the modification date of the file you mentioned is not updated, even though its contents are changing.
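
If anyone wants to reproduce this, here is a rough Python sketch of one way to time the checkpoint writes. It polls the file and compares its contents, since the modification date apparently cannot be trusted here. The function name, its parameters and the polling approach are my own assumptions, not anything taken from CmDock or BOINC.

```python
import os
import time
from pathlib import Path


def watch_checkpoint(path, poll_seconds=5.0, max_changes=3, sleep=time.sleep):
    """Poll a checkpoint file and return the seconds between content changes.

    Contents are compared directly because the reported modification time
    may not move. The first interval is measured from the moment watching
    starts, so it is usually shorter than a full checkpoint period.
    """
    path = Path(path)
    last_content = path.read_bytes()
    last_change = time.monotonic()
    intervals = []
    while len(intervals) < max_changes:
        sleep(poll_seconds)
        content = path.read_bytes()
        if content != last_content:
            now = time.monotonic()
            intervals.append(now - last_change)
            # Print the modification time as well, to see whether it moves.
            print(f"changed after {intervals[-1]:.0f} s, mtime={os.path.getmtime(path)}")
            last_content, last_change = content, now
    return intervals


# Example call (left commented out because it blocks until the file changes):
# watch_checkpoint("slots/1/docking_out.chk", poll_seconds=5.0)
```

With a 5 second poll, intervals of about 120 s would match the "every 2 min" observation, while about 600 s would mean the BOINC setting is being honoured.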
ID: 954 · Rating: 0

Drago75
Joined: 6 Feb 21
Posts: 2
Credit: 5,083,832
RAC: 0
Message 977 - Posted: 27 May 2021, 11:10:30 UTC

Congratulations on adding checkpoints! Although not everything seems to be perfect yet, this is a great relief for crunchers like me who don't have their machines up and running 24/7. From now on I will dedicate them more towards your project.
ID: 977 · Rating: 0

Dr Who Fan
Joined: 24 Oct 20
Posts: 19
Credit: 458,162
RAC: 0
Message 978 - Posted: 27 May 2021, 14:14:52 UTC

I still don't see any use of checkpoints on Windows PCs. BOINC always shows ZERO, and the time since checkpoint equals the run time.

ID: 978 · Rating: 0

©2024 SiDock@home Team