Linux: Zipping up a group of files after X days

Hi all,
I have an FTP directory in which files are continuously uploaded each day (say around 40 or so). They all have the same filename structure, e.g.:
Code:
dep1_20070228085845.csv

Basically, the first 4 characters group the files (there are "dep" and "par" files, with numbers from 1-3). The remaining characters after the underscore are the date and time the file was generated, in reverse format (so the example above is 28/02/2007 8:58:45am).

I want to have a cron task that runs every night to bunch any files that are 7 days or older into a .zip file for that date. So, I would end up with the following:
Code:
dep1_2007-02-28.zip
par2_2007-02-28.zip
dep1_2007-02-27.zip
etc etc....

How can I achieve this with standard Linux commands?

Cheers,
Matt
 
Quick and dirty shell script, put
Code:
#!/bin/bash

#Gets the date (YYYY-MM-DD format)
date=`date +%F`
echo $date;

#GNU date makes this easy :-)
lastweek=`date --date="1 week ago" +%F`

#Zips common named files up (-tt: only files last modified before $lastweek)
zip -tt $lastweek "/wherever/you/want/dep1_$date" /your/upload/directory/dep1*
zip -tt $lastweek "/wherever/you/want/dep2_$date" /your/upload/directory/dep2*
zip -tt $lastweek "/wherever/you/want/dep3_$date" /your/upload/directory/dep3*
zip -tt $lastweek "/wherever/you/want/par1_$date" /your/upload/directory/par1*
zip -tt $lastweek "/wherever/you/want/par2_$date" /your/upload/directory/par2*
zip -tt $lastweek "/wherever/you/want/par3_$date" /your/upload/directory/par3*

In a shell script and crontab that for each night, redirecting the output to a log file if you need one (it should list each filename as it's added). zip -tt keys on file modification time and only picks up files last modified before the given date, which should be OK for your purposes.
 
Hi,
Thanks very much for the reply. Unfortunately, I've been a bit thick. The files are actually directories, with up to 4 files inside, e.g.
Code:
/archive
    /dep1_20070228134513/
         file1.txt
         file2.txt
         file3.txt
         file4.txt
    /dep2_20070228091241/
         file1.txt
         file2.txt
         file3.txt
         file4.txt
etc....

How does that affect the script?

Cheers,
Matt
 
I'm a total knob at shell scripting but methinks all you need to do is add a "-r" to make it recursive.
Code:
#!/bin/bash

#Gets the date (YYYY-MM-DD format)
date=`date +%F`
echo $date;

#GNU date makes this easy :-)
lastweek=`date --date="1 week ago" +%F`

#Zips common named files up (the date must directly follow -tt)
zip -r -tt $lastweek "/wherever/you/want/dep1_$date" /your/upload/directory/dep1*
zip -r -tt $lastweek "/wherever/you/want/dep2_$date" /your/upload/directory/dep2*
zip -r -tt $lastweek "/wherever/you/want/dep3_$date" /your/upload/directory/dep3*
zip -r -tt $lastweek "/wherever/you/want/par1_$date" /your/upload/directory/par1*
zip -r -tt $lastweek "/wherever/you/want/par2_$date" /your/upload/directory/par2*
zip -r -tt $lastweek "/wherever/you/want/par3_$date" /your/upload/directory/par3*
 
Changing the zip commands to
Code:
zip -r -i \*.txt -tt $lastweek ...
should work as the file/directory modification date check will still apply. I'd run one of the zip lines by itself with $lastweek replaced with a proper date, see what you get and tweak if necessary :).
 
BillytheImpaler said:
I'm a total knob at shell scripting but methinks all you need to do is add a "-r" to make it recursive.

I'm not convinced you're right here...

The files are stored like so:

/archive
/dep1_20070228134513/

Your script doesn't take into account that the system date format might be different. Those directory names are produced using the following format:

date +%Y%m%d%H%M%S

Surely the code should read something more like the version below.

I'll tidy it up slightly as well: using "date" as a variable name is a little ugly, and generally not good practice, particularly once you start writing more complicated scripts!

Code:
#!/bin/bash

#Gets the date (YYYY-MM-DD format)
curdate=`date +%F`
echo "$curdate"

#GNU date makes this easy :-)
lastweek=`date --date="1 week ago" +%Y%m%d`

#Zips common named files up
zip -t -r "/path/to/destination/dep1_$curdate" /source/archive/dep1_$lastweek*
zip -t -r "/path/to/destination/dep2_$curdate" /source/archive/dep2_$lastweek*
zip -t -r "/path/to/destination/dep3_$curdate" /source/archive/dep3_$lastweek*
zip -t -r "/path/to/destination/par1_$curdate" /source/archive/par1_$lastweek*
zip -t -r "/path/to/destination/par2_$curdate" /source/archive/par2_$lastweek*
zip -t -r "/path/to/destination/par3_$curdate" /source/archive/par3_$lastweek*

We could probably tidy that zip block up with a for loop too.
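
For what it's worth, a minimal sketch of that for loop might look like the following. It assumes GNU date, takes the source and destination directories as arguments, and matches on the date in the directory name instead of using zip's date options. The function name archive_week_old is made up for illustration. Like the block above, it only picks up directories stamped exactly one week ago.

```shell
#!/bin/bash
# Sketch only: archive_week_old is a hypothetical name, and both paths
# are supplied by the caller.
# Usage: archive_week_old /source/archive /path/to/destination
archive_week_old() {
    src=$1
    dest=$2
    # Same stamp format as the directory names (needs GNU date)
    lastweek=$(date --date="1 week ago" +%Y%m%d)

    for prefix in dep1 dep2 dep3 par1 par2 par3; do
        # Expand the glob into the positional parameters,
        # e.g. /source/archive/dep1_20070221134513
        set -- "$src/${prefix}_${lastweek}"*
        # With no match the pattern stays literal, so skip this prefix
        [ -e "$1" ] || continue
        zip -r "$dest/${prefix}_${lastweek}.zip" "$@"
    done
}
```

Prefixes with nothing to zip are skipped, so the cron log should only show archives that were actually written.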
 
Garp said:
I'm not convinced you're right here...

Me neither. zip -t (or -tt, which might be the original intention) needs a date, which you haven't provided, and won't your version only get files from exactly one week ago, rather than all files older than that?
 
AndrewP said:
and won't your version only get files from exactly one week ago, rather than all files older than that?

I was wondering about that too, and was originally going to account for it in my code. But as the script runs in a daily cron job, I figured it's not really necessary: someone would just need to zip up all the old stuff manually once, then leave it to the cron job from that point on.

Otherwise I guess we'd have to start walking down the "find . -name blah -mtime +7" route?
 
Garp said:
Otherwise I guess we'd have to start walking down the "find . -name blah -mtime +7" route?

I was going to suggest this actually. You can get find to execute a command on each match with
"find . -name blah -mtime +7 -exec <command> {} \;" etc.
I am fairly sure you can use that to zip stuff up.
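
A rough sketch of the find route, assuming GNU find and the directory layout described earlier in the thread (/archive/dep1_20070228134513/ and so on). The function name zip_old_dirs and its two path arguments are hypothetical. It names each zip after the prefix and date taken from the directory name, so the output looks like dep1_2007-02-28.zip.

```shell
#!/bin/bash
# Sketch only: zip_old_dirs is a hypothetical name.
# Usage: zip_old_dirs /archive /path/to/destination
zip_old_dirs() {
    src=$1
    dest=$2
    # -mtime +7 matches entries last modified at least 8 full days ago;
    # use +6 if "seven days or older" is what you need.
    find "$src" -mindepth 1 -maxdepth 1 -type d -name '*_*' -mtime +7 -print |
    while IFS= read -r dir; do
        name=$(basename "$dir")      # e.g. dep1_20070228134513
        prefix=${name%%_*}           # dep1
        stamp=${name#*_}             # 20070228134513
        # Turn the leading YYYYMMDD into YYYY-MM-DD
        day=$(printf '%s\n' "$stamp" | sed 's/^\(....\)\(..\)\(..\).*/\1-\2-\3/')
        # zip adds to an existing archive, so directories sharing a
        # prefix and date end up in the same file
        zip -r "$dest/${prefix}_${day}.zip" "$dir"
    done
}
```

Unlike the exact-date match, this also sweeps up anything older that an earlier run missed.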
 
BruceLee said:
I was going to suggest this actually. You can get find to execute a command on each match with
"find . -name blah -mtime +7 -exec <command> {} \;" etc.
I am fairly sure you can use that to zip stuff up.


I was going to suggest this as well. Both methods are equally effective, though there's more room for debugging in the little script.
 
Yeah, the find method seems to be a nice efficient way of doing things. I can run the following manually, and it works a treat:
Code:
find . -type d -name "dep1_20070116*" -print | zip -r -o -m -T dep1_20070116 -@
That finds the appropriately named directories and passes the list to zip (-@ reads the names from standard input), which zips them up recursively. The other parameters set the zip file's modification time to that of its newest entry (-o), delete the source files once zipped (-m), and check the zip file's integrity before finishing (-T).

So, I've changed my process slightly. I'm going to have another script which moves the files to be zipped into an archive folder. I can then either do this manually, or just set it to move in files older than 7 days.

So the thing I need to work out now is how to substitute the filenames in the above command - i.e., how I can make the script look through a directory, find a group of folders (20070204* for example), and then use this filename for the zip filename?

Matt
 
feenster99 said:
So the thing I need to work out now is how to substitute the filenames in the above command - i.e., how I can make the script look through a directory, find a group of folders (20070204* for example), and then use this filename for the zip filename?

Okay, good call on the change. I reckon the best way to achieve this would be to use the ` symbol (the one next to the 1 key, not the one on the @ key). This lets you execute a command partway through another one and pass its output back.

This is an ugly way to demonstrate this..

Code:
vi `locate php.ini`

That would execute the locate php.ini command first and pass its response back to vi as the first argument; so in this case looking for every instance of php.ini and opening them for editing in vi.

So what we can do here is take your code:

Code:
find . -type d -name "dep1_20070116*" -print | zip -r -o -m -T dep1_20070116 -@

and then just slot the date --date='1 week ago' +%Y%m%d command into the appropriate places (the directory pattern and the zip name):

Code:
find . -type d -name "dep1_`date --date='1 week ago' +%Y%m%d`*" -print | zip -r -o -m -T dep1_`date --date='1 week ago' +%Y%m%d` -@
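
To cover every group in the archive folder rather than a single date, something along these lines might do. It is only a sketch: the function name zip_groups is made up, and it assumes GNU find plus names made of a 4-character prefix, an underscore and an 8-digit date (13 characters in all), as described at the top of the thread.

```shell
#!/bin/bash
# Sketch only: zip_groups is a hypothetical name.
# Usage: zip_groups /path/to/archive/folder
# The body runs in a subshell ( ... ) so the cd does not leak out.
zip_groups() (
    cd "$1" || exit 1
    # Reduce each directory name to its first 13 characters
    # (e.g. dep1_20070116) and keep one copy of each group
    find . -maxdepth 1 -type d -name '*_[0-9]*' -print |
    sed 's|^\./||' | cut -c1-13 | sort -u |
    while IFS= read -r group; do
        # Same pipeline as above, with the group name substituted in
        find . -maxdepth 1 -type d -name "${group}*" -print |
        zip -r -o -m -T "$group" -@
    done
)
```

Because of -m the source directories are removed once zipped, so it is worth trying this on a copy of the folder first.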
 