Re: Lazy man's google search of the group...
Posted by Dave Kowalczyk on 2002-11-13 22:23:44 UTC
Jason:
It's pretty trivial to "slurp" the messages with any of a number of
programming languages. Here's how they are linked:
http://groups.yahoo.com/group/CAD_CAM_EDM_DRO/message/52485
Replace "52485" with any integer from 1 upward. There must be some
browser dependencies; when I do this with VB directly to archive the
TurboCNC list posts, the bot never sees the ad pages, but I do get
all the HTML, so a little code is needed to strip the buttons, etc.
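
A minimal Python sketch of the slurp-and-strip idea described above. It is not the VB code mentioned in the post. The URL pattern is the one quoted above; the function names, the latin-1 decoding, and the choice to drop script/style contents are assumptions made for illustration, and the real pages may need more specific cleanup.

```python
from html.parser import HTMLParser
from urllib.request import urlopen

# URL pattern taken from the post; group name and number are filled in.
BASE = "http://groups.yahoo.com/group/{group}/message/{number}"


def message_url(group, number):
    """Build the URL for one message number in the given group."""
    return BASE.format(group=group, number=number)


class _TextOnly(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        # Buttons and other form controls carry no text nodes, so they
        # vanish once the tags themselves are discarded.
        if not self._skip and data.strip():
            self.parts.append(data.strip())


def strip_html(html):
    """Return the text content of an HTML page, one chunk per line."""
    parser = _TextOnly()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)


def fetch_message(group, number):
    """Hypothetical helper: download one message page as plain text."""
    with urlopen(message_url(group, number)) as resp:
        # The page encoding is an assumption, not something the post states.
        return strip_html(resp.read().decode("latin-1"))
```

Looping `fetch_message("CAD_CAM_EDM_DRO", n)` over n = 1, 2, 3, ... would walk the archive; a polite delay between requests is advisable.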
According to the Yahoo TOS, we mustn't "duplicate the Service" or
modify the interface, or sell the postings. Yahoo, in principle,
owns all the intellectual property we post through some fairly
tortuous legal language. I don't think this has ever been exercised
though, or even could be!
http://docs.yahoo.com/info/terms/
Sec 16 - "You agree not to access the Service by any means other than
through the interface that is provided by Yahoo for use in accessing
the Service. "
heh! Can't blame them - they're in business like everyone else but
IMHO their "Service" has evolved a certain amount of sanctimonious
suckiness with ad interrupts and the like.
Anyway, my interpretation of their agreement is that it's fine to
download the messages in txt format, zip them up, and stick them in
the files section on the same Yahoo list since this is
not "duplicating the Service" - the messages are accessible on Yahoo
by the usual methods in either case. That's how I do it anyway.
IANAL, so YMMV and all that. You're staring down about 200 MB of
data if you want all the posts for CCED as uncompressed plaintext.
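
The "download as txt, zip them up" step could be sketched in Python roughly as follows. The `zip_messages` name, the dictionary of message number to text, and the five-digit file naming are all hypothetical choices for illustration, not the method actually used for the TurboCNC archive.

```python
import zipfile


def zip_messages(messages, archive_path):
    """Write each message as NNNNN.txt into one compressed archive.

    `messages` maps message number -> plain text (assumed structure).
    """
    with zipfile.ZipFile(archive_path, "w",
                         compression=zipfile.ZIP_DEFLATED) as zf:
        for number in sorted(messages):
            # Zero-padded names keep the files in message order.
            zf.writestr(f"{number:05d}.txt", messages[number])
    return archive_path
```

Plain-text posts usually compress well, so the resulting zip should be considerably smaller than the uncompressed total.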
Dave Kowalczyk
Everett WA
TurboCNC software --> http://www.dakeng.com
> Thanks for the feedback... there are shortcomings, of that there is
> no doubt. But I am trying! ;)
>
> I'm to the point now where it seems the only real way to have a
> comprehensive handle on things is to have all the messages stored
> somewhere I can get my hands on them.
>
> I might just screen scrape the whole darn thing. Someone said there
> may be copyright issues... someone care to explain them?
>
> How about if I make a search but never expose my copy of the messages
> and just refer the person searching to the yahoo post?
>
> Jason
Discussion Thread
Askew, Jason
2002-11-13 07:48:28 UTC
Lazy man's google search of the group...
Marv Frankel
2002-11-13 08:15:01 UTC
Re: [CAD_CAM_EDM_DRO] Lazy man's google search of the group...
turbulatordude
2002-11-13 08:32:32 UTC
Re: Lazy man's google search of the group...
echnidna
2002-11-13 18:57:13 UTC
Re: Lazy man's google search of the group...
killthiskid
2002-11-13 19:13:24 UTC
Re: Lazy man's google search of the group...
echnidna
2002-11-13 22:06:01 UTC
Re: Lazy man's google search of the group...
Dave Kowalczyk
2002-11-13 22:23:44 UTC
Re: Lazy man's google search of the group...
echnidna
2002-11-13 23:35:26 UTC
Re: Lazy man's google search of the group...
turbulatordude
2002-11-14 05:06:46 UTC
Re: Lazy man's google search of the group...
jmkasunich
2002-11-14 05:56:56 UTC
Re: Lazy man's google search of the group...
JJ
2002-11-14 09:21:23 UTC
RE: [CAD_CAM_EDM_DRO] Re: Lazy man's google search of the group...
Raymond Heckert
2002-11-14 19:48:08 UTC
Re: [CAD_CAM_EDM_DRO] Re: Lazy man's google search of the group...
echnidna
2002-11-14 20:19:31 UTC
Re: Lazy man's google search of the group...
Fred Smith
2002-11-14 21:31:23 UTC
Re: Lazy man's google search of the group...
Chris L
2002-11-15 21:30:09 UTC
Re: [CAD_CAM_EDM_DRO] Re: Lazy man's google search of the group...