Welcome to Emulationworld



Subject: ZipMax - New setting to assist in distributed processing
Posted by: PacFan
Posted on: 01/29/04 11:56 AM



Roman --

I have an idea that would be small to implement and greatly increase the functionality of ZipMax.

Ideally, I'd like some client/server way to manage distributing the rezipping over numerous machines connected to a network: sending the smallest zips to remote (and/or slower) machines and working on the larger zips on the main (and/or faster) machines.

Obviously that would take a lot of rework.

However, I have an idea that would accomplish this in a different way with probably a couple of lines of code and one new ini setting.

Here's the plan:
- Add an ini file setting called "SkipReadOnlyFiles=0".
- The default would be 0.
- Add a line of code in the file-parsing routine to check whether the read-only bit is set; if it is, skip the file and continue to the next one (something it might want to do regardless of this feature?).
- If it is not read-only, mark it read-only immediately and start processing.
- Mark it writeable again after processing is complete (just before the source zip is updated) or in any failure cleanup routines. Continue to set the Archive bit on success, of course.
- Since the default value would be 0, the behavior is the same as you currently have, so nothing breaks for anyone who doesn't know how to set/clear the read-only flag (though I doubt anyone would have their files marked read-only to begin with).
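The claim-and-release logic in the steps above can be sketched roughly as follows. This is just an illustration in Python, not ZipMax's actual (C++) code, and the function names are hypothetical; it uses the file's write permission bit as a portable stand-in for the Windows read-only attribute:

```python
import os
import stat

def try_claim(path):
    """Try to claim a file for processing by setting it read-only.
    Returns False if the file is already read-only (claimed by
    another machine), True if this machine successfully claimed it."""
    mode = os.stat(path).st_mode
    if not mode & stat.S_IWRITE:
        return False               # read-only: someone else is working on it
    os.chmod(path, stat.S_IREAD)   # mark read-only to claim it
    return True

def release(path):
    """Clear the read-only bit after processing completes, or in any
    failure cleanup routine."""
    os.chmod(path, stat.S_IREAD | stat.S_IWRITE)
```

Note the small window between the check and the chmod: two machines can both pass the check before either marks the file, which is exactly the low-probability collision discussed under Drawbacks below.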

Benefits:
This would then allow you to set up a single file share on the main machine, and then, from the remote machines, highlight the files and drag and drop them into each machine's copy of ZipMax. (You could sort by file size first to send the smaller files to the remote machines, saving network bandwidth, or to match sizes to slower processors, etc.)

Drawbacks:
- It does mark the files read-only and could leave something in a bad state if processing falls apart completely. But it doesn't delete the file or do anything "bad".
- It is possible, though not very likely, for two machines to grab the same file at the same time, both mark it read-only, and both work on it (or to catch it not marked read-only just as the original zip is being replaced). Network latency and differing processor speeds make it unlikely that two machines reach the same file at exactly the same moment.


I'd be glad to make the changes myself, as I am a programmer, and in fact I looked at the ZipMax code two years ago in an early version (and added support on my own machine for 99 zippers before BJW got his update in). However, I currently don't have the correct compiler and library versions, and I figured you could make this update rather quickly if you thought it would work for most people.

You might also choose to use the Hidden flag, but I think ReadOnly makes the most sense, as read-only files probably should be skipped in any case (even without a flag or this feature).

It would significantly help the speed of recompression, ESPECIALLY when using BJWFlate, which is awesome for the extra compression it can achieve over 7za and numerous other compressors. I have 5 networked computers at home and 3 at work, and this would make the chore of a large rezip easy, with virtually no manual segmentation or work to keep things running.

LMK your thoughts and thanks for the work on a great program!


Subject: Re: ZipMax - New setting to assist in distributed processing
Posted by: PacFan
Posted on: 01/29/04 12:11 PM



Oops, I forgot to add that you would need another flag and a corresponding check: SkipIfArchiveSet=0. It would also have to be set to 1 so that a file already completed by another machine isn't reprocessed, as long as you started with the flag cleared on all of the files.

But the two in conjunction (Skip archive, skip read only) would get the distributed functionality working and without having to write a client/server app.
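The combined skip decision those two flags imply is simple enough to state as a pure function. A minimal sketch (Python for illustration only; the flag names mirror the hypothetical ini settings, not any actual ZipMax API):

```python
def should_skip(readonly, archive_set, skip_readonly=True, skip_archive=True):
    """Decide whether to skip a file, given its attribute bits and the
    two proposed ini flags (SkipReadOnlyFiles, SkipIfArchiveSet)."""
    if skip_readonly and readonly:
        return True   # another machine is currently working on it
    if skip_archive and archive_set:
        return True   # already finished by some machine
    return False      # free to claim and process
```

With both flags off (the defaults), nothing is ever skipped, which preserves the current single-machine behavior.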


Subject: Re: ZipMax - New setting to assist in distributed processing
Posted by: Roman
Posted on: 01/29/04 12:40 PM



> Ooops, I forgot to add that you would need to add another flag and subsequent
> check: SkipIfArchiveSet=0. It would have to be set to 1 as well in order to not
> reprocess a file that was already done by another machine, so long as you
> started with all of them with the flag off.
>
> But the two in conjunction (Skip archive, skip read only) would get the
> distributed functionality working and without having to write a client/server
> app.
>


Feel free to add it ;) Ben Jos and I are busy with real life at the moment, although I will keep the ideas in mind.

Roman Scherzer
ClrMamePro


Subject: Re: ZipMax - New setting to assist in distributed processing
Posted by: PacFan
Posted on: 01/29/04 07:14 PM



Okay, I managed to find the right compiler version (7.1 - .NET 2003) and got everything compiled. I have added the functionality and am testing it this evening.

If everything goes well, I should be able to give you a new version by the end of the weekend, after I triple-check things.

Thanks again!

PS: Anyone else have any good ideas for ZipMax?


SubjectRe: ZipMax - New setting to assist in distributed processing new Reply to this message
Posted byf205v
Posted on01/30/04 04:31 AM



> PS: Anyone else have any good ideas for ZipMax?

Yes, I have one (even if I don't know whether it's good or not).

Add a file-open menu, which can point to a single file or to a directory (possibly with a toggle to include/exclude subdirectories).

That way, users can choose between the drag-and-drop feature and the file-open feature, whichever is more convenient.

Just my 2 cents....


PS: I have a second one: add some columns to the GUI (possibly with toggles to show/hide them).

The additional columns would be: original_size, final_size_for_pass_1, final_size_for_pass_2, etc.

Alternatively, only one column could be added: best_compression_pass.

This feature is only of statistical interest, I know...


ciao
f205v


Subject: Done -- Multi-Machine processing for ZipMax
Posted by: PacFan
Posted on: 01/30/04 11:50 AM



I have the changes complete, compiled, and tested.

I will email Roman the changed source code, readme, and ini file to be updated to .51, if he approves of the changes.


Here is a description of the feature as implemented:

How can I use multiple computers to process a bunch of files without manually
separating chunks of work or having them process the same file more than once?

ZipMax version .51 introduces 3 new zipmax.ini settings: SkipReadOnly,
SkipArchive, and SetCurrentReadOnly. By setting all of these to 1, you will
be able to point 2 or more machines at the same list of files on a network
share without having to worry about separating them or duplicating processing.

How to use it:
==============
1) Mark all of your zip files as neither read-only nor ready for archiving.

In Windows Explorer, simply highlight all the files and/or folder(s), right
click, choose Properties, and clear the Read-only checkbox under Attributes.
Then click the Advanced button, clear the "File is ready for archiving"
checkbox, and click OK and OK.

From a DOS prompt, simply cd into the folder containing the files and type:

attrib -r -a *.zip

You may use the /s flag to also change the attributes of files in subfolders.


2) Set the ini file settings (SkipReadOnly, SkipArchive, SetCurrentReadOnly)
to 1 and relaunch ZipMax.

3) On the first machine, drag and drop all the files/folder(s) to be
processed into ZipMax.

4) On each additional machine, drag and drop the same list of files/folder(s),
from the same folder over the network, onto that machine's own copy of ZipMax.

5) That's all. As the files are processed, any file already being worked
on by one machine is tagged read-only and will therefore be skipped on all
other machines. Likewise, files already completed by a machine are tagged
with the Archive attribute and subsequently skipped on the other machines.

Together, these zipmax.ini flags ensure that no machine wastes time
processing a file more than once.
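Steps 1-5 above boil down to a simple per-file loop on each machine. The sketch below simulates it in Python (the real tool is C++ on Windows): the write-permission bit stands in for the read-only attribute, the Archive-bit bookkeeping is omitted because Python only exposes it on Windows, and `compress` is a placeholder for the actual recompression routine:

```python
import os
import stat

def process_list(paths, compress):
    """For each file: skip it if already claimed (read-only), otherwise
    claim it, run the compressor, and release the claim afterwards.
    Returns the list of files this machine actually processed."""
    done = []
    for path in paths:
        mode = os.stat(path).st_mode
        if not mode & stat.S_IWRITE:
            continue                   # read-only: claimed by another machine
        os.chmod(path, stat.S_IREAD)   # claim the file
        try:
            compress(path)
            done.append(path)
        finally:
            # release the claim whether compression succeeded or failed
            os.chmod(path, stat.S_IREAD | stat.S_IWRITE)
    return done
```

Two machines running this loop over the same shared file list will each end up with a disjoint subset of the work, apart from the rare timing collision described under Special notes.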


Special notes:
==============
Rather than dragging and dropping an entire folder (which sorts the same
on every system), you can select the files and drop them from largest to
smallest on the FASTEST system (or on the main system, which isn't going
over the network), and from smallest to largest on the slower or networked
systems.

This ensures that the largest files finish on a fast system and that less
data needs to be transmitted over the network (especially important if you
are using wireless).

It also reduces the chance of a collision. Although each system will skip
files already marked A or R, exact timing could let both systems grab a file
before its status changes to read-only and both work on it, with the faster
one updating it and the slower one receiving an error when attempting to
replace the original zip (now changed by the other system). Tests show less
than a 0.5% collision rate on identically ordered lists; changing the orders
could limit potential collisions to 1 in total, no matter how large the list
is. Please note: even when one system reports an error (rather than a skip),
the other system will succeed, and the file will be compressed with its
archive bit set.

You must have a separate copy of zipmax.exe and zipmax.ini (and all
required archivers) on each machine IF you plan to write to a log file;
otherwise, you can point every machine at the same folder on the main
machine.


SkipReadOnly = 1 or 0. Set it to 1 if you want files marked read-only
(R bit set) to be skipped.

SkipArchive = 1 or 0. Set it to 1 if you want files marked for archive
(A bit set) to be skipped.

SetCurrentReadOnly = 1 or 0. Set it to 1 if you want files that are being
processed to be marked read-only. Note: files revert
to writeable when completed, even if they are unchanged.

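Putting the three settings together for multi-machine use, the relevant lines of zipmax.ini would look something like this (the exact section layout of the real file may differ; this is just the settings described above):

```ini
; Distributed-processing settings (ZipMax .51)
SkipReadOnly=1        ; skip files whose R (read-only) bit is set
SkipArchive=1         ; skip files whose A (archive) bit is set
SetCurrentReadOnly=1  ; mark a file read-only while it is being processed
```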


Subject: Re: Done -- Multi-Machine processing for ZipMax
Posted by: Roman
Posted on: 02/02/04 04:13 AM



I will look at it later today ;)

Roman Scherzer
ClrMamePro

