--- changes per libwhisker release -------------------------------------

[] libwhisker 1.8

- Fixed a bug that would cause Net::SSLeay to segfault/bus error on socket close. Basically, Net::SSLeay::free() has to be called before the normal socket close.
- Fixed an SSL timeout bug reported by Pavel Kankovsky and Sullo.
- Fixed the socket sharing problem; it will be OK for now, but I still recommend the eventual shift to libwhisker2, which has a new stream/socket implementation that bypasses all these problems.
- utils_normalize_uri() pulled from updated libwhisker2 code.
- dump() pulled from updated libwhisker2 code; it now returns undef on error.

--------------------------------------------------------------------------

[] libwhisker 1.7

- Changes to html_find_tags in order to let it parse XML documents as well. Also a few speed enhancements, but they will most likely go unnoticed during average use. XML reader will be introduced in LW2.
- Documentation updates, fixed POD typos, etc.
- Multiple Set-Cookie headers were ignored due to a bug in cookie_read. This would only happen if A) {whisker}->{ignore_duplicate_headers} was set to 0 (not default), and B) the server sent multiple cookies.
- Changed cookie_read() to use utils_find_lowercase_key(), to be a little more robust regardless of the alpha case of the Set-Cookie header.
- You can now have libwhisker bind the socket to a specific port and address (currently for non-SSL sockets only). {whisker}->{bind_port} and {whisker}->{bind_addr} control the values (they default to 14011 and INADDR_ANY, respectively). You need to set {whisker}->{bind_socket}=1 to trigger the binding.
- Shortcut in crawl() to skip non-absolute http/https URLs. Pointed out by Michael Coulter.
- utils_split_uri() got a big makeover, and was made to be RFC-compliant on parsing URLs (parsing order is important). It should handle URIs with no network location info now too (also known as 'net_loc').
  It also fixes a bug in which 'http' scheme name comparison was not case-insensitive.
- utils_join_uri() also got a makeover. It should now handle non-HTTP URIs given by utils_split_uri(). It also now adds username/password as well.
- Tweaks to utils_get_dir(), for proper parsing order and speed. Also removed some dead code.
- dump() now correctly saves undefined values (undef).
- Documented the ignore_duplicate_headers gotcha that may affect some programs. It's in the KNOWNBUGS file, even though it's not really a bug.
- utils_absolute_uri() now strips fragments and parameters from base_uri before combining URIs.
- Anti-IDS mode 3 was missing a few characters. All better now.

--------------------------------------------------------------------------

[] libwhisker 1.6

- I broke crawl() in v1.4 by implementing some changes in utils_absolute_uri(), without reflecting those changes in crawl(). Somehow it managed to survive all of v1.5 without being noticed...
- dumper() will no longer escape the space character.
- Better documented dumper() to indicate structures are saved as references.
- Fixed http_do_request_ex() so it doesn't include the "\r" at the end of {http_resp_message}.
- Added the get_page_hash() function, which basically just requests the given URL, and returns the resulting whisker response hash. Suggested by jeremiah[at]whitehatsec.com.
- cookie_write()'s check for not sending secure cookies over insecure connections was backwards. Apologies to Thomas Reinke (reinke[at]e-softinc.com) for not putting this in v1.5.
- cookie_parse() was too greedy when it came to '='. Thanks again to Thomas Reinke.
- Moved the {raw_headers_save} code behind the error check in http_do_request().
- Sprinkled more 'defined' checks in http_do_request() in order to stop a tr/// warning reported by Jeremiah Grossman.
- utils_recperm did not check to see if an array was defined before using it, causing it to crash.
--------------------------------------------------------------------------

[] libwhisker 1.5

- Recording of forms by crawl() did not normalize the URL before storing it, making it impossible to find the absolute location of relative form actions. All fixed now.
- Crawl() didn't clear out previous result values from %LW::crawl_*, which led to results carrying over to future scans. Crawl() now clears things out. Reported by Seraphim Down.
- Crawl() now saves cookies to %LW::crawl_cookies if the save_cookies crawl config is set.
- I broke the perl implementation of md5 when I added the replacement 'use integer' code--and I didn't catch it because my test machine uses the normal MD5 perl module. The problem has to do with me not wrapping the $^H manipulation in a BEGIN{}. Thanks to the many who pointed the problem out, particularly filipe[at]rnl.ist.utl.pt.
- Changed http_queue_read() for Net::SSLeay to return errors on SIGPIPE signals. Thanks to filipe[at]rnl.ist.utl.pt.
- Changed is_sock_valid() to always return 'no' for SSL sockets; the libraries we currently use do not allow persistent connections, so we will never reuse an SSL connection. Prevents some SIGPIPEs from propagating up into the code.
- Inserted SSL session resuming code, which involved a few modifications to http_do_request_ex() internals. Session resuming leads to a speed increase for sequential SSL requests, when using the Net::SSLeay lib. This behavior is on by default, and can be turned off by setting {whisker}->{ssl_resume}=0.
- Reworked Net::SSLeay code so that the SSL context (CTX) is reused, rather than reallocated per request. Hopefully this will give a little performance boost, and will be more system resource friendly.
- Added support for being able to specify the exact order headers are presented to the server, which is controlled by an anonymous array stuffed into {whisker}->{header_order}.
  This removes one of the last remaining hurdles that libwhisker imposes on the programmer--coming ever closer to a 'no rules' paradigm. :)
- Fixed perl warning in http_do_request()'s handling of 100-Continue responses. Thanks to jeremiah[at]whitehatsec.com.
- auth_set_header() has the optional $domain parameter added.
- Added the do_auth() alias for auth_set_header(). do_auth() is easier to remember, and makes much more sense for an API name. Suggested by Seraphim Down.
- auth_brute_force() has the optional $domain parameter added.
- md5 lib was renamed to mdx, due to the addition of md4 routines (needed for NTLM authentication). This doesn't affect the APIs one bit.
- Added encode_unicode() to convert normal strings to unicode strings (basically sticking in NULLs every other character).
- The ntlm subpackage was added, which holds NTLM auth routines, and the supporting DES functions. Right now all crypto is implemented in 100% perl, which is slow, but does the job...although benchmarks show my NTLM auth routines to be 300% faster than Authen::NTLM on multiple requests with the same username/password.
- http_do_request() was largely modified to accommodate NTLM auth. NTLM (and MD5 too) auth sucks because it requires multiple HTTP requests, which does not jive very well with the single-request-per-http_do_request() call model currently set up. Not to mention the whisker retries feature also throws a curve ball, all the while trying to maintain persistent connections. In the end it works, but http_do_request() exploded in size. Fortunately now the framework is there to eventually add MD5 auth too.
- Added utils_text_wrapper() for pretty printing.
- forms_parse_callback() incorrectly had $UNKOWNS instead of $UNKNOWNS.
- Tweakage to the README, api-demo.pl, and various documentation. Some changes pointed out by: Maximiliano Pérez, Seraphim Down
- Added some stuff to the Makefile 'diag' option.
- encode_str2ruri() had bugs which kept it from producing the correct encoded URI. This also affected anti-IDS mode 1. Thanks to Benoît Calmels for pointing it out and supplying patched code.
- multipart_read() did not pass the correct parameters to multipart_read_data() (the $filepath variable was not propagated correctly).
- http_do_request_ex() tweak to not delete CR/LFs from $resp, in case there's an error and it needs to be written to {whisker}->{data}.
- http_do_request_ex() now sets {whisker}->{INITIAL_MAGIC} in responses to 31338, so that it's possible to tell apart requests and responses by looking at the INITIAL_MAGIC value (31337 for requests, 31338 for responses).
- Some changes to the various docs.
- Defining/setting {whisker}->{save_raw_headers} in the request causes http_do_request(_ex) to save the raw response and headers into {whisker}->{raw_header_data} in the response hash.

--------------------------------------------------------------------------

[] libwhisker 1.4

- Made various changes to support 5.004 Perl versions. Some of the changes came at a price--view the docs/OPTIMIZE text file included with the source distribution in order to learn of some tweaks for your platform in order to gain back a few fractions of a second in speed. :) I'm currently evaluating if 5.003 support is possible...
- The md5 function will now abort the entire program if it detects that the md5 result is incorrect (it does an internal self-test upon init). This is to keep funky architectures/perls from generating incorrect md5 checksums (without any warning they are incorrect). Realistically, this is just a worst-case scenario, and really shouldn't affect anyone. If it does affect you, then drop me a note and we'll get it all fixed up.
- Made some changes to utils_recperm, which are not backwards-compatible. However, I'm willing to guess that *no one* used that function, since it was largely undocumented, obscure, and scary looking.
  If you happened to use that function, then email me and I'll accommodate you in the next release. The change itself involved adding another callback function for testing code responses.
- Added the dump subpackage, which provides functions that give basic Data::Dumper functionality. That's one less perl module you need to depend on.
- Added utils_save_page() to have an easy way to save the HTML output of a response to a file.
- Error check in utils_split_uri(): basically, don't transfer bogus values to the given hash if the protocol is not HTTP or HTTPS. I would hope that the user would check for such a condition, but you never know...the downside is that you might re-run a prior HTTP request if you try to set a non-HTTP protocol (without checking it first).
- Added utils_getopts(), which is a reimplementation of Getopt::Std. Yet again, one less perl module dependency. :)
- utils_absolute_uri() now can take a third parameter flag which indicates whether or not to pass the output through utils_normalize_uri() before returning it. Things are still backwards compatible with the 2-param version. I made other changes to it that make it identical to URI->abs().
- Turns out the 'use integer' pragma is actually an external module (!). This caused the MD5 routines to puke, and thus the entire library to puke, if the proper modules weren't installed. So I recoded the MD5 routines to set integer mode manually (by fiddling with $^H).
- Added the utils_unidecode_uri() function, which does basic unicode decoding, catching any overlong characters.
- auth_brute_force() return code was changed to give undef on error or no password found...that way you can test for an empty ('') password.
- http_do_request[_ex] now copies over {whisker}->{uri} into the response/hout hash.
- crawl_set_config() now can take multiple parameters at once, as in: crawl_set_config(param1=>'val1',param2=>'val2')
- crawl()'s handling of %LW::crawl_forms was completely broken. All better now...
- crawl() actually does much better now that utils_normalize_uri() has been fixed up. Thanks to Mark.Schaad[at]veritect.com for pointing out some bugs.
- Fixed a warning in cookie_read().

--------------------------------------------------------------------------

[] libwhisker 1.3

- Added the {whisker}->{queue_md5} option, which requests that libwhisker save a hash of the output queue before sending it to the server.
- According to ActiveState's BugDB, ActivePerl under Windows does not support non-blocking sockets, thus no non-blocking connects for Windows users. It seems even the IO::Socket module doesn't get nonblock support. I'm glad I found this out, as I was bending over backwards to try to accommodate Windows and non-blocking....
- So what I wound up doing was using the standard alarm()/eval() combo around the blocking connect...hopefully this will still provide a crude way to get some kind of connect() timeout. Windows users are still out of luck, but fortunately the Windows connect() does time out and not hang infinitely...and the timeout is acceptable (neighborhood of 15-20 seconds). At least, this was the case on my Win2K Pro box w/ ActiveState perl.
- Fixed bug in html_find_tags(). Callback was using $start and not $tagstart. Thanks to Jeremiah[at]whitehatsec.com for pointing it out.
- Various work on crawl(). @LW::crawl_cookies was changed to %LW::crawl_cookies (@LW::crawl_cookies was never populated, so this change shouldn't affect anyone as the code was essentially unimplemented until this point). Added %LW::crawl_referrers, which will save the referring pages when the save_referrers value is set to 1 (set to 0 by default).
- Added the utils_absolute_uri() and utils_normalize_uri() functions. Basically they mangle/parse URLs in various ways to get the correct URL representation.
- Added normalizing to crawl(), pointed out by multiple people. Crawl will normalize all found URIs when crawl_config{'normalize_uri'} is set to 1 (which it is by default).
- Fixed potential error in sock_valid (used $$Z instead of $$z). Also cleaned up a warning with uninitialized $vin.
- Fixed bugs in cookie_parse() which didn't account for whitespace and deleted the path when it was set to '/'. Thanks to cyril.perrault[at]fr.pwcglobal.com for the pointer.
- Added lots of binmode() on various sockets/file handles, to keep Windows/DOS from mucking things up.
- Slight tweak to Makefile.pl so that LW.pm is assembled the same on various systems (libs added in alphabetic order, vs. whatever random order the readdir() returned them in); otherwise line numbers from warnings/errors don't always correspond to the same code...
- anti_ids() incorrectly set {'ids_session_splicing'} instead of {'ids_session_splice'}. All better now.
- Updated the docs/whisker_hash.txt file.
- {'ssl_save_info'} can only save data from Net::SSLeay; however, it was possible for Net::SSL to reach that point, which would puke. Now the option is ignored if Net::SSL is in use.

--------------------------------------------------------------------------

[] libwhisker 1.2

- The return value of inet_aton was not checked; if a DNS entry did not exist, then the library actually tried to connect to the localhost. All better now.
- There was a bug in the non-blocking connect code added in 1.1. Basically, if a connect failed, the library would still try to use the socket. The bug was actually a really subtle/arcane one that involved the eval{} statement--rather than the function returning when an error occurred, the eval seemed to trap the return, and thus short-circuit errors back into the pipeline. I removed the eval{}, with the tradeoff that platforms that don't have the fcntl function will error out; however, since the code will only be executed if the Fcntl module is present, I think this is a safe assumption.
- Fixed up a few warnings and 'use strict' areas.
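
The eval{} behavior described in the 1.2 notes is a general Perl property: a 'return' inside eval{} leaves only the eval block, not the enclosing sub. The sketch below is not libwhisker code; the sub names connect_with_eval() and connect_checked() are hypothetical and no real socket is opened. It only shows how a failure can slip past such a 'return', and one common way to signal it instead (die inside the eval, then test the eval's result).

```perl
use strict;
use warnings;

# Hypothetical stand-in for a connect routine; $ok simulates whether
# the underlying connect() succeeded.
sub connect_with_eval {
    my ($ok) = @_;
    eval {
        # This 'return' exits the eval block only, not the sub.
        return 0 unless $ok;
    };
    # Execution continues here even when $ok was false,
    # so the caller is told the connect worked.
    return 1;
}

# Same idea, but the failure is raised with die and the
# eval's own return value is checked afterwards.
sub connect_checked {
    my ($ok) = @_;
    my $done = eval {
        die "connect failed\n" unless $ok;
        1;    # eval yields a true value on success
    };
    return 0 unless $done;
    return 1;
}

print connect_with_eval(0), "\n";    # prints 1: the failure was swallowed
print connect_checked(0),   "\n";    # prints 0: the failure is reported
```

Removing the eval{} altogether, as the 1.2 release did, avoids the problem at the cost of no longer trapping fatal errors from the wrapped code.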
--------------------------------------------------------------------------

[] libwhisker 1.1

- Added the entire multipart subpackage, which is used for multipart requests, such as those needed to upload files. The syntax is similar to the cookie subpackage (get/set to manipulate values, and read/write to put those values into a request). The multipart_read function works in my test scenarios, but is definitely still work-in-progress. Unfortunately the multipart package added over 8K of code to the library size. :/
- Modified utils_split_uri() to take an optional %hin_request hash. If one is supplied then the function will not only split the URI, but also set the appropriate values in the hash.
- Added the 'easy' subpackage, which is just a collection of high-level routines to do simple tasks: get_page(), get_page_to_file(), upload_file(), download_file(), etc.
- The test.pl script was renamed to api_demo.pl. I also added the simple.pl script, to show off the simple/easy request support added by the easy subpackage.
- Added missing POD documentation.
- Sprinkled in a little error checking for passed function params.
- Added utils_lowercase_hashkeys, which is an alias to utils_lowercase_headers. I didn't want it to seem as if the function was limited to headers...
- Added utils_find_lowercase_key(), which will try to find the given key in a supplied hash, regardless of the case of the key or the keys in the hash.
- Beefed up utils_randstr, so that you can specify an optional returned string size, and what characters it's composed of. Still backwards-compatible with the prior 1.0 version.
- Added utils_getline and utils_getline_crlf, for grabbing (CR)LF lines out of a piece of data. They internally track position, so you need to reset them initially by calling utils_getline(\$data,0) the first time; after that you just call utils_getline(\$data). The crlf version requires a terminating CRLF sequence; the regular utils_getline just stops at the next LF (\n).
- Added MD5 support via the new md5 subpackage. Use the md5() command to get a string hex hash of the supplied data. Like the encode_base64 routines, it will attempt to use the faster MD5 Perl module, if available. MD5 support is needed for eventual Digest authentication...
- While libwhisker had timeout support for already-established connections, it didn't have timeouts for the initial connect(). So I implemented nonblocking connects on systems that will support it (libwhisker will test for it and use it automatically). Many thanks to Lincoln Stein's "Network Programming with Perl" book for being an excellent reference through all this nonblocking IO cruft.
- Fixed a small bug which caused HTML data to not be saved if the HTML data didn't end in a \n and the server didn't return a Content-Length header. This was due to me using sock_getline in a loop to retrieve HTML data.
- Added the internal-use-only sock_getall, which was used to combat the bug mentioned above (where HTML data was lost).
- Added the forms subpackage, which is the start of routines to parse out HTML forms into something that can be programmatically reviewed. Right now there's forms_read() to turn HTML into form hashes, and forms_write() to turn a form hash back into generic HTML.
- The html_find_tags function was lowercasing all found tag/element names. This technically alters the data, so it was modified to *not* lowercase the names. It is the responsibility of the callback function to handle any required conversions (including making them lowercase if need be).
- Recoded the Makefile to be platform-independent, and moved the nopod script functionality into the Makefile. Plus other changes.

--------------------------------------------------------------------------

[] libwhisker 1.0

- Simon Cozens was awesome enough to spare some time and give a once-over to libwhisker, pointing out tons of perl-isms and potential changes.
  We had some discussion about making libwhisker a proper CPAN module--in the end, I believe my goals for libwhisker and the goals of CPAN are different enough to not really warrant that happening. But I did heed a lot of the advice. Details follow...
- HUGE CHANGE: the namespace is changed from 'lw' to 'LW', and the library was renamed to LW.pm in order to make the package name and module name the same. It was capitalized since lowercase names passed to 'use' are typically reserved for perl pragmas. Thanks to Simon for pointing this out. I figure if libwww can have LWP, libwhisker can have LW. :)
- ANOTHER NOTABLE CHANGE: I added the minimum requirement of perl 5.005. This is because 5.004 and prior don't have some of the functions libwhisker uses (like the 4 argument substr). It may be possible to port the code back to 5.004, but I don't have intentions of doing that right now.
- Added {'whisker'}->{'recv_header_order'} to keep track of the exact order of incoming headers. Useful for various testing.
- Successfully ran libwhisker on Windows with ActiveState perl (there weren't really any doubts, but I just wanted to confirm it anyways). 128K requests without a burp...so support on Windows is definitely as solid as on unix/linux. SSL doesn't count...
- Bug tweaks to html_find_tags. These were found while trying to parse the evil.htm file located in /docs/. Also, I've been using the LW::bin::html_find_tags replacement, which means I haven't been seeing the bugs sitting in the perl version (LW::html_find_tags). So I've cleaned up some problems that may have been affecting crawl(), and both perl and C versions parse the same (correct) results.
- Added