2002-03-22 Grub team
* 0.3.0 Release
2002-03-13 Ledio Ago - lajesus
* Modified the way information and errors are logged in
the logfile. There is a global variable pointing to a log
object which will be seen by all the client modules.
Also, the client's logging level is now configurable. The user
of the client will be able to set the logging level through
the config file (grub.conf).
2002-03-06 Ledio Ago - lajesus
* Added a new implementation of the archive, now called the
client database (CDB). The CDB is made of two data
files, CrawlerDb.dat and ServerDb.dat.
ServerDb.dat holds the URLs and their information, such as
CRC, MIME type, and size, that initially come from the server.
As these URLs are pulled out of ServerDb.dat, they
are crawled by the crawlers.
CrawlerDb.dat holds all the information about the crawled
URLs: contents, CRC, MIME type, size, and status.
CDB supports binary data.
2002-03-05 Kord Campbell - kord
* Moved the Status_Info stuff around so everyone is using it
for their settings and statistics. Removed remote statistics
fetching from the GUI and put it in with the ServerSetting
stuff where it belonged. Added kill -HUP functionality to
the main thread to enable re-reading the config file and/or
statistics. Added some testing for the DOWN detection so that
the crawler won't think it needs to exit when it is crawling a
DOWN host with multiple URLs.
2002-02-27 Bucy Rajan - bucy
* Fixed flicker in the GUI and updated some exits in the code
to do nice pthread exits instead.
2002-02-27 Igor Stojanovski - ozra
* Fixed nasty COM internal problems causing the Visual Basic program to
crash; also did some code cleanup
2002-02-18 Kord Campbell - kord
* Saif added constant DOWN detection and proxy functionality.
I changed a few of the help settings and added some kill
handling routines. Added use of the proxy settings to the remote
server configuration routines.
2002-02-18 Igor Stojanovski - ozra
* Added the guiface module. This module (stored in guiface/) contains
the code necessary for communication between the crawler base, and
a control front-end (a GUI) for controlling the client. The code in
this directory should also compile on Windows. This module depends
on expat XML library (see http://expat.sourceforge.net/), and by
default it does not compile with it. In order to do that, you need
to do './configure --with-expat', after you have installed expat on
your machine. Without expat, guiface is not functional.
2002-01-23 Kord Campbell - kord
* Finished adding in the per run features. Client now is able
to exit once it has crawled a certain number of URLs or a certain
number of bytes. Added in kill handling to cleanly exit the GUI
and to allow the crawler/protocol to finish up before exiting.
2002-01-07 Kord Campbell - kord
* Changed the name of the struct John_Paul to Crawler_Status_Info
and moved its location to util/StatusInterface.h in preparation
for adding additional GUI interface control via the stack stuff
that Ozra is working on for Windows and UNIX. A few changes
were made to the grub.cpp file while Saif works on adding the
per run limit functionality to the code.
2002-01-04 Kord Campbell - kord
* Added additional checking to see if a page has really changed or
not. If a page has a different CRC, but yet its size is about the
same as what it was the last time, we consider it a NOCRAWL and a
possible dynamic page.
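The check described in the entry above might look roughly like the
following sketch. All names here are hypothetical, and the 5% size
tolerance is an assumption: the entry only says the size is "about the
same" and does not give the real threshold.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical names; the real client code is not shown in this changelog.
enum class CrawlDecision { Unchanged, NoCrawl, Changed };

// size_tolerance is an assumed fraction of the old size; the changelog
// only says "about the same" and does not state a value.
CrawlDecision classify_page(std::uint32_t old_crc, std::size_t old_size,
                            std::uint32_t new_crc, std::size_t new_size,
                            double size_tolerance = 0.05) {
    assert(size_tolerance >= 0.0);
    if (old_crc == new_crc) {
        return CrawlDecision::Unchanged;  // identical checksum: nothing changed
    }
    // CRC differs: compare sizes to guess whether this is a dynamic page.
    const std::size_t diff =
        old_size > new_size ? old_size - new_size : new_size - old_size;
    if (static_cast<double>(diff) <=
        size_tolerance * static_cast<double>(old_size)) {
        return CrawlDecision::NoCrawl;    // roughly same size: likely dynamic
    }
    return CrawlDecision::Changed;        // size moved noticeably: real change
}
```

A page whose CRC differs but whose size moved from 1000 to 1010 bytes
would be classified as NoCrawl under this assumed tolerance.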
2001-12-28 Igor Stojanovski - ozra
* Found and fixed a bug in passing the user agent string to the cURL
class. Made some changes to the namespace additions that Saif made.
2001-12-06 Kord Campbell - kord
* Finished converting the entire project to pthreads. Not as bad
as it seemed at first. The client now compiles nicely on Windows,
which is great. Changed the testCrawl.cpp code around a bit to
allow for entering URLs at the prompt, for testing. Changed the
name of the logfile to grubclient.log.
2001-11-19 Kord Campbell - kord
* Ozra found a HUGE mistake in the ClientDB class, that has been
a source of a lot of client hangs during archiver exceptions. The
problem was an erroneous sscanf() statement after logging an
archiver event. Doh!
2001-11-16 Kord Campbell - kord
* Changed grub_safe script to run new executable
2001-11-09 Kord Campbell - kord
* Limited page download sizes to 1 MB, to prevent out-of-memory (OOM) errors
2001-11-08 Kord Campbell - kord
* 0.2.0 Release
* fixed gui display bugs - still needs work
* fixed archiver exception errors from printing to screen
(they still log to the logfile)
* added improved kill handling to gui, and crawler, children
* enabled the shared memory cleanup routines
* finished ranking, client stats info routines
* enabled bandwidth limiting on crawls
* added host protect routines to protect web sites
2001-10-29 Kord Campbell - kord
* 0.2.0-pre Release
2001-10-18 Kord Campbell - kord
* Fixed REDIRECT URL length to 100 characters. Will increase
the value once we modify the database to handle longer URLs.
Currently the Protocol module crashes on put when there are
extremely long redirect URLs.
2001-10-15 Kord Campbell - kord
* Replaced calling a single Wget for each URL crawled with cURL
routines that continue to crawl new URLs. New child crawlers are
true forks of the software, not exec'd Wgets hacked together to make
the silly thing work. Replaced named pipe child/parent communication
routines with shared memory routines. A performance increase of at
least two times has been realized.
2001-08-15 Kord Campbell - kord
* Added in GUI functionality. GUI was written by phil, loki and
redline during Summer of 2001. Removed named pipe communication
layer and replaced it with shared memory routines.
2001-07-17 Igor Stojanovski - ozra
* Removed logging to putlog.log of URLs that have been crawled
2001-06-28 Igor Stojanovski - ozra
* Changed it so that it compiles on gcc v.3.0 thanks to a patch
by Vaclav Barta.
2001-06-25 Igor Stojanovski - ozra
* Changed user agent when doing HTTP requests to web servers from
Wget/VERSION to Grub-Client/VERSION
2001-06-19 Igor Stojanovski - ozra
* 0.1.6 Release
2001-06-14 Lawrence Kincheloe, Phil - Loki, Redline
* Added error checking that uses delay.cpp to make grub sleep for
an exponentially increasing random amount of time. This keeps
the crawler from continuously trying to connect to
the server when the server is down or out of URLs.
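As a rough illustration of the idea in the entry above, a randomized
exponential backoff could be sketched as below. This is not the code in
delay.cpp, which is not shown here. The function name, the one-second
base, and the one-hour cap are all assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <random>

// Hypothetical helper, not the actual delay.cpp implementation.
// Returns a random delay in [0, min(max_seconds, base_seconds * 2^attempt)].
// base_seconds and max_seconds are assumed defaults.
int backoff_delay_seconds(int attempt, std::mt19937 &rng,
                          int base_seconds = 1, int max_seconds = 3600) {
    assert(attempt >= 0 && base_seconds > 0 && max_seconds >= base_seconds);
    long long ceiling = base_seconds;
    // Double the ceiling once per failed attempt, stopping at the cap.
    for (int i = 0; i < attempt && ceiling < max_seconds; ++i) {
        ceiling *= 2;
    }
    ceiling = std::min<long long>(ceiling, max_seconds);
    // Pick a random point below the ceiling so clients do not retry in step.
    std::uniform_int_distribution<int> dist(0, static_cast<int>(ceiling));
    return dist(rng);
}
```

A caller would increment the attempt counter after each failed connection
and reset it to zero once the server responds.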
2001-06-14 Igor Stojanovski - ozra
* The child grub process (the one created from the Coordinator), if
PERIODIC_RESTART is defined to integer X, will do X GET/CRAWL/PUT
runs (currently set to 10), and then will exit with status 0; the
parent grub process will then restart using the Coordinator. This
way the OS will clean up any memory/file descriptor leaks.
2001-06-12 Igor Stojanovski - ozra
* When the client crashes (that is, catches a deadly signal such as SIGSEGV)
or exits unexpectedly, this condition will be logged (in crashlog.log
by default), and the crawler will be restarted.
* Kill should finally work well when SIGTERM'ed. If only the client
is SIGTERM'ed, it will be restarted. To shut down the client, you
need to either do 'killall grub', or do 'kill <parent-process-id>'
2001-05-30 Kord Campbell - kord
* Removed some DOS/Win32 includes in archive/archive.cpp to
enable the client to compile nicely on BSD. Updated the
README once again and incremented the version to:
* 0.1.4b Release
2001-05-29 Igor Stojanovski - ozra
* Got rid of the double-spaced printing to the log file
* Alternate FTW file traversal utility was implemented, thanks
to Jesper Juhl. Now systems (like BSD) without it can still
compile.
2001-05-23 Igor Stojanovski - ozra
* Now, the client should shut down properly on kill or killall (SIGTERM)
2001-05-23 Grub team
* 0.1.3 Release
2001-05-22 Igor Stojanovski - ozra
* Made it so that the client now does not compile the TThread module. If
you need to compile it for whatever reason, now you need to add
the TThread directory in MAKE_SUBDIRS variable in configure.in, and
make TTHREAD_LIB="../TThread/libtthread.a". We may get rid of this
module altogether eventually. Also, don't forget the -DUSE_THREADS
and -D_REENTRANT flags.
The TThread/ directory is no longer distributed with the tarball.
* Now, by default, the client compiles without threads. To compile
with threads, you need to pass the -DUSE_THREADS and -D_REENTRANT flags to
all files.
* modified root Makefile.am so that wget-VERSION/src/wget is copied
to src/. Now you can run the client without having to do
'make install'.
2001-05-21 Igor Stojanovski - ozra
* Applied a patch to the project config and make files sent to me by Jeff Squyres
(thanks a lot, Jeff). The changes are:
- wget is better integrated with the rest of the project
- C/CXXFLAGS are not overridden by the makefiles
- installation is more consistent
- grub_safe script was added
- a few other things
2001-05-18 Igor Stojanovski - ozra
* Added -D_REENTRANT flag to all compiled files
2001-05-17 Igor Stojanovski - ozra
* configure.in checks if ZLIB is installed on the system. If
not, it will quit, and notify the user.
* The log files don't accumulate indefinitely any more. After a
maximum is reached, the log will be overwritten with the new
logging data.
2001-05-17 Grub team
* 0.1.2 Release
2001-05-16 Kosta Damevski
* Fixed a bug which seg-faulted the original running client when
a second instance was fired up. This was happening because the
new client did file clean-ups before it checked the lock
file and exited because the original was still running (which
effectively messed up the original client's files and caused the crash).
* Fixed a bug in Verboseprintf() -- a string containing '%' was
either causing a crash, or printing garbage (at best).
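The '%' problem described above is typical of passing arbitrary text as
a printf-style format string. The sketch below shows the usual remedy
with hypothetical helpers. It is not the real Verboseprintf(), whose
signature is not given in this changelog.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <string>

// Hypothetical stand-in for a printf-style logger; not the real Verboseprintf().
std::string format_message(const char *fmt, ...) {
    assert(fmt != nullptr);
    char buf[512];
    va_list args;
    va_start(args, fmt);
    // vsnprintf truncates instead of overflowing when the message is too long.
    std::vsnprintf(buf, sizeof buf, fmt, args);
    va_end(args);
    return std::string(buf);
}

// Safe pattern: arbitrary text is passed as an argument to "%s", never as the
// format string itself, so any '%' in it is copied literally.
std::string log_text(const std::string &text) {
    return format_message("%s", text.c_str());
}
```

With this pattern a string such as a URL containing "%20" is logged
unchanged instead of being interpreted as a conversion specifier.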
2001-05-11 Grub team
* 0.1.1 Release