CAD CAM EDM DRO - Yahoo Group Archive

request for feedback, comments, and suggestions concerning archives.

Posted by terrylr@b...
on 2001-12-20 01:37:32 UTC
hello;

i am rewriting the archive programs i use to archive various mailing lists.
changes yahoo.com has made to the way messages are archived are what
prompted this rewrite. the archive program is being rewritten in perl.

each message is downloaded as an html file. 1000 messages in html format occupy
roughly 8.2 megabytes of disk space. there are roughly 36000 messages archived
for cad_cam_edm_dro, so the entire cad_cam_edm_dro html archive would occupy roughly
300 megabytes. 1000 plaintext messages occupy roughly 4.2 megabytes of disk space,
so the entire cad_cam_edm_dro plaintext archive would occupy roughly 150 megabytes.
disk space savings from the below proposals are not included in these figures.
the figures are also for uncompressed archives. compression savings depend on
which compression program is used: bz2 compresses better than gzip, and gzip
compresses better than pkzip.
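
as a rough check on the figures above, the arithmetic can be sketched in python
(the rewrite itself is in perl; python is used here only for illustration). the
constant and function names are made up for this sketch, and the inputs are the
approximate figures quoted above.

```python
# rough archive size estimate from the per-1000-message figures quoted above
MESSAGES = 36_000          # approximate message count from the post
HTML_MB_PER_1000 = 8.2     # html figure from the post
TEXT_MB_PER_1000 = 4.2     # plaintext figure from the post


def archive_size_mb(messages: int, mb_per_1000: float) -> float:
    """scale a per-1000-message size up to the whole archive."""
    return messages / 1000 * mb_per_1000


html_mb = archive_size_mb(MESSAGES, HTML_MB_PER_1000)
text_mb = archive_size_mb(MESSAGES, TEXT_MB_PER_1000)
saving = 1 - text_mb / html_mb   # fraction saved by dropping the html markup

print(f"html:      {html_mb:.1f} MB")
print(f"plaintext: {text_mb:.1f} MB")
print(f"plaintext saves {saving:.0%} before compression")
```

this gives about 295 megabytes for html and about 151 megabytes for plaintext,
in line with the rounded 300 and 150 megabyte figures.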

i have been testing it on cad_cam_edm_dro and have several proposals concerning
archiving the mailing list. i would appreciate feedback from members of cad_cam_edm_dro
concerning the below proposals.

first some brief background.
the cad_cam_edm_dro mailing list has been through several changes of hosting service
ownership. there was onelist.com, then egroups.com, and now yahoogroups.com.
several of these hosting services placed 'ads' or 'sponsorship' notices in each
posting to the mailing list.

proposal 0: these ads/sponsorship notices should be stripped out of the archives.
reasons are: 0: many are obsolete/out-of-date/broken links.
1: they waste disk space.
2: given that the archives are plaintext and not html they are
meaningless.
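
one way proposal 0 might work is sketched below in python (illustration only; the
rewrite itself is in perl). it assumes each ad sits between a recognizable start
line and end line. the markers in the example are hypothetical; the real markers
differ between hosting services and would have to be collected from the actual
messages.

```python
def strip_ad_blocks(text: str, start_marker: str, end_marker: str) -> str:
    """drop every run of lines from a start-marker line through the next end-marker line."""
    kept = []
    pending = []      # lines of an ad block whose end has not been seen yet
    in_ad = False
    for line in text.splitlines():
        if in_ad:
            pending.append(line)
            if end_marker in line:
                pending.clear()   # block closed: discard it
                in_ad = False
        elif start_marker in line:
            in_ad = True
            pending.append(line)
        else:
            kept.append(line)
    # an unterminated block is kept, so a missing end marker never eats a message
    kept.extend(pending)
    return "\n".join(kept)


# hypothetical message and markers, for illustration only
message = "\n".join([
    "hello list,",
    "<<ad begin>>",
    "buy widgets at example.com",
    "<<ad end>>",
    "the stepper question is below.",
])
cleaned = strip_ad_blocks(message, "<<ad begin>>", "<<ad end>>")
print(cleaned)
```

keeping an unterminated block is a deliberate choice in this sketch: losing an ad
is harmless, losing the body of a message is not.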

proposal 1: references to previous mailing list hosting services should be stripped
out.
reasons are: 0: they are obsolete.
1: they waste disk space.

proposal 2: messages which include links to items on ebay.com should have all
links and references to ebay stripped out.
reasons are: 0: the majority are obsolete/out-of-date/broken links.
1: given that the archives are plaintext and not html they are
meaningless to include.
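
a minimal python sketch of the link half of proposal 2 follows (illustration only;
the rewrite itself is in perl). it removes only http/https urls whose host is
ebay.com or a subdomain of it. plain-text references to ebay that are not urls
are not handled. the url pattern and function names are assumptions made for
this sketch.

```python
import re
from urllib.parse import urlsplit

# deliberately simple url pattern; real messages may need something stricter
URL_RE = re.compile(r"https?://[^\s<>\"]+")


def is_ebay(url: str) -> bool:
    """true when the url's host is ebay.com or one of its subdomains."""
    try:
        host = (urlsplit(url).hostname or "").lower().rstrip(".")
    except ValueError:        # malformed url: leave it alone
        return False
    return host == "ebay.com" or host.endswith(".ebay.com")


def strip_ebay_links(text: str) -> str:
    """remove ebay urls and leave every other url untouched."""
    return URL_RE.sub(lambda m: "" if is_ebay(m.group(0)) else m.group(0), text)


# hypothetical message text, for illustration only
sample = "see http://cgi.ebay.com/item=123 and http://www.example.com/dro.html"
print(strip_ebay_links(sample))
```

matching on the parsed host rather than on the substring "ebay" keeps unrelated
hosts such as notebay.com from being stripped by accident.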

proposal 3: urls in messages should be stripped out and listed in a companion file.
reasons are: 0: a central location would make searching easier.
1: a central location would allow for validating the urls and allow
for broken urls to be removed and a single file uploaded.
2: would reduce new cad_cam_edm_dro members' frustration in finding
information.
3: would reduce redundant urls being posted again and again.
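
proposal 3 could be sketched as follows in python (illustration only; the rewrite
itself is in perl). the placeholder text, the function names, and the shape of
the url index are all assumptions made for this sketch. whether a placeholder is
left behind in the message at all is an open design choice.

```python
import re

URL_RE = re.compile(r"https?://[^\s<>\"]+")
PLACEHOLDER = "[url moved to companion file]"   # hypothetical wording


def extract_urls(text: str, placeholder: str = PLACEHOLDER):
    """return (text with urls replaced, urls in first-seen order without repeats)."""
    found = []

    def replace(match):
        raw = match.group(0)
        url = raw.rstrip(".,;:!?)")      # sentence punctuation is not part of the url
        if url not in found:
            found.append(url)
        return placeholder + raw[len(url):]

    return URL_RE.sub(replace, text), found


def build_companion(messages: dict):
    """messages maps message number to body; returns (stripped bodies, url index)."""
    index = {}       # url -> message numbers that mentioned it
    stripped = {}
    for number, body in messages.items():
        stripped[number], urls = extract_urls(body)
        for url in urls:
            index.setdefault(url, []).append(number)
    return stripped, index


# hypothetical messages, for illustration only
messages = {
    1: "dro plans at http://www.example.com/dro.html.",
    2: "again: http://www.example.com/dro.html and http://www.example.org/edm",
}
stripped, index = build_companion(messages)
for url, numbers in index.items():
    print(url, numbers)
```

because each url appears once in the index with the messages that mentioned it,
a broken url can be validated or removed in one place, and repeated postings of
the same url show up as a longer list of message numbers.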


i am open to serious feedback, comments, and suggestions.
i would prefer if you reply privately to me concerning this message.
i will summarize the replies to the list after a suitable time has
gone by.

--
Terry L. Ridder
Blue Danube Artistic Forge (Blaue Donau Kunstschmiede)
"We do not bend metal, we sculpt it."

digging deep, i feel my conscience burn
i need to know who and what i am
this hunger jolts me from complacency
rocks me, makes me meet myself
----kendall payne---closer to myself---

Discussion Thread

terrylr@b... 2001-12-20 01:37:32 UTC request for feedback, comments, and suggestions concerning archives.
Bill Vance 2001-12-20 05:15:16 UTC Re: [CAD_CAM_EDM_DRO] request for feedback, comments, and suggestions concerning archives.
Gail & Bryan Harries 2001-12-20 06:15:10 UTC RE: [CAD_CAM_EDM_DRO] request for feedback, comments, and suggestions concerning archives.