2004-08-01 David A. Wheeler <dwheeler, at, dwheeler.com>
* Released version 2.26.
* Modified driver.h to clearly state the GPL license.
This doesn't change anything, but it makes the
Savannah people happy.
2004-07-31 David A. Wheeler <dwheeler, at, dwheeler.com>
* Released version 2.25. Changes are:
* Per request from Savannah, added the more detailed licensing
text to every source file.
* Modified the assembly language counting code, based on useful
feedback and a test case from Purnendu Ghosh, so that
the heuristics work better at guessing the right comment character
and they perform well.
In particular, the comment character '*' is far better supported.
* Added support for Delphi project files (.dpr files, which are
essentially in Pascal syntax), thanks to Christian Iversen.
* Some versions of Perl are apparently causing trouble, but
I have not yet found the solution for them (other than using
a different version of Perl). The troublesome line of code
is in break_filelist, and currently says:
open(FH, "-|", "md5sum", $filename) or return undef;
This could be changed to:
open(FH, "-|", "md5sum $filename") or return undef;
But I dare not fix it that way, because that would create
a security problem. Imagine downloading someone
else's source code (who you don't know), using sloccount, and
that other person has created in their source tree a file
named like this: "; rm -fr /*" or its variations.
I'd rather have the program fail in specific circumstances
(users will know when it won't work!) than to insert a known
dangerous security vulnerability. I can't reproduce this problem;
it's my hope that those who CAN will help me find a good
solution. For the moment, I'm documenting the problem here and
in the TODO list, so that people will realize WHY it hasn't
just been "fixed" with the "obvious solution".
The answer: I care about security.
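The difference between the two forms above can be illustrated outside Perl.
The following Python sketch is not sloccount's code; it only shows that a
hostile filename passed as one element of an argument list reaches the child
program as plain data, because no shell ever parses it. The helper name
echo_argument, and the use of the Python interpreter as the child program
(so the example runs without md5sum), are assumptions for illustration.

```python
import shlex
import subprocess
import sys


def echo_argument(filename):
    """Run a child process with filename as a single argv element.

    Hypothetical helper: the child is the Python interpreter itself and
    simply prints the argument it received. No shell is involved, so
    characters such as ';' or '*' in the name are never interpreted.
    """
    child_code = "import sys; print(sys.argv[1])"
    result = subprocess.run(
        [sys.executable, "-c", child_code, filename],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.rstrip("\n")


hostile = "; rm -fr /*"
# The name arrives in the child unchanged, as plain data.
print(echo_argument(hostile))
# If a shell command string were unavoidable, the name would need quoting first.
print("md5sum " + shlex.quote(hostile))
```

The list form mirrors the multi-argument open() that the entry keeps; the
quoting call is shown only as what a string-based command would require.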
2004-05-10 David A. Wheeler <dwheeler, at, dwheeler.com>
* Released version 2.24 - a few minor bugfixes and improvements.
Automatically tries to use several different MD5 programs, until
it finds one that works - this is more flexible, and as a result,
it now works out-of-the-box on Apple Mac OS X.
SLOCCount now accepts "." as the directory to analyze,
it correctly identifies wrapper scripts left by libtool as
automatically generated code, and correctly identifies debian/rules
files as makefiles. Also, installation documentation has improved.
My thanks to Jesus M. Gonzalez-Barahona for telling me about the
Debian bug reports and testing of candidate versions.
My thanks to Koryn Grant, who told me what needed to be done
to get SLOCCount running on Mac OS X (and for testing my change).
This version resolves Debian Bug reports #173699,
#159609, and #200348.
2004-04-27 David A. Wheeler <dwheeler, at, dwheeler.com>
* Automatically try several different MD5 programs, looking for
a working one. Originally this program REQUIRED md5sum.
This new version tries md5sum, then md5, then openssl.
The good news - the program should now 'just work' on
Apple Mac OS X. The bad news - if md5sum doesn't exist,
sloccount still has a good chance of working, but will display
odd error messages while it searches for a working MD5 program.
There doesn't seem to be an easy way in perl to suppress such
messages while still permitting "trouble reading data file"
messages. However, doing the test at run-time is much more
robust, and this way it at least has a chance of working on
systems it didn't work on at all before.
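The fallback order described above can be sketched as follows. This is a
Python illustration under stated assumptions, not the Perl code in
sloccount: it looks each program up on the PATH rather than running it and
watching for errors, and the exact arguments given for openssl are an
assumption. The names CANDIDATES and pick_md5_command are hypothetical.

```python
import shutil

# Candidate commands in the order the entry lists them.
# The "md5" argument to openssl is an assumption for this sketch.
CANDIDATES = [["md5sum"], ["md5"], ["openssl", "md5"]]


def pick_md5_command(which=shutil.which):
    """Return the first candidate whose program is found, or None.

    The lookup function is injectable so the choice can be exercised
    without depending on what is installed on the current machine.
    """
    for argv in CANDIDATES:
        if which(argv[0]) is not None:
            return argv
    return None


# On a system with only "md5" (as on Mac OS X), the second candidate wins.
only_md5 = lambda name: "/sbin/md5" if name == "md5" else None
print(pick_md5_command(which=only_md5))
```

Probing the PATH avoids the odd error messages mentioned above, at the cost
of not proving that the program found actually works.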
* Removed the "debian" subdirectory. There was no need for it;
it's best for the Debian package maintainers to control that
information on their own.
2004-04-25 David A. Wheeler <dwheeler, at, dwheeler.com>
* Allow "." and ".." as specifications for directories even
when they have no subdirectories.
This resolves Debian bug report log #200348
("Sloccount . fails").
* Correctly identify wrapper scripts left by libtool as
automatically generated code.
When linking against a libtool library, libtool leaves a wrapper
script in the source tree (so that the binary can be executed
in-place, without installing it), which includes this:
(line) # foo - temporary wrapper script for .libs/foo
(line) # Generated by ltmain.sh - GNU libtool 1.4.3
(1.922.2.111 2002/10/23 02:54:36)
I fixed this by saying that any comment beginning with
"Generated by" in the first few lines must be auto-generated
code. This should correctly catch other auto-generated code too.
There is a risk that code NOT automatically generated will be
incorrectly labelled, but that's unlikely.
This resolves Debian Bug report logs #173699,
"sloccount should ignore libtool-generated wrapper scripts".
* Now identifies "debian/rules" files as a makefile.
This resolves Debian Bug report logs - #159609,
"sloccount Does not consider debian/rules to be a makefile".
* Minor fix to sloccount makefile, so that man page installs
correctly in some situations that didn't before.
My thanks to Jesus M. Gonzalez-Barahona.
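The "Generated by" heuristic from this entry can be sketched roughly as
below. This is a Python illustration, not the detector in break_filelist:
the set of comment leaders, the limit of five lines, and the function name
looks_auto_generated are all assumptions.

```python
import re

# Assumed comment leaders; the real detector may recognize a different set.
_GENERATED = re.compile(r"^\s*(#|//|/\*|;|--)\s*Generated by\b")


def looks_auto_generated(lines, first_n=5):
    """Return True if a comment starting with 'Generated by' appears
    within the first few lines (first_n is an assumed limit)."""
    return any(_GENERATED.match(line) for line in lines[:first_n])


wrapper = [
    "#! /bin/sh",
    "",
    "# foo - temporary wrapper script for .libs/foo",
    "# Generated by ltmain.sh - GNU libtool 1.4.3",
    "echo run",
]
print(looks_auto_generated(wrapper))
```

Because only comments near the top are examined, ordinary code that merely
prints the phrase is not flagged, which matches the low risk of
mislabelling noted in the entry.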
2003-11-01 David A. Wheeler <dwheeler, at, dwheeler.com>
* Version 2.23 - a few minor bugfixes and improvements.
2003-11-01 David A. Wheeler <dwheeler, at, dwheeler.com>
* Fixed incorrect UTF-8 warnings. Perl 5.8.0 creates warnings
when the LANG value includes ".UTF-8" but the text files read
aren't UTF-8. This causes problems on Red Hat Linux 9 and others,
which set LANG to include ".UTF-8" by default.
This version quietly removes ".UTF-8" from the LANG value for
purposes of sloccount, to eliminate the problem.
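The LANG adjustment described above amounts to something like the following
Python sketch. It is an illustration only; sloccount does this in its own
scripts, and the helper name env_without_utf8 is hypothetical.

```python
import os


def env_without_utf8(env=None):
    """Return a copy of the environment with '.UTF-8' removed from LANG.

    The caller's environment is left untouched; only the copy, intended
    for child processes, is changed.
    """
    env = dict(os.environ if env is None else env)
    lang = env.get("LANG")
    if lang is not None:
        env["LANG"] = lang.replace(".UTF-8", "")
    return env


print(env_without_utf8({"LANG": "en_US.UTF-8"}))
```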
2003-11-01 David A. Wheeler <dwheeler, at, dwheeler.com>
* Fixed bad link to "options" in sloccount.html; my thanks to
Barak Zalstein (Barak.Zalstein, at, ParthusCeva.com) for
telling me.
* Added "--version" option that prints the version number.
Thanks to Auke Jilderda (auke.jilderda, at, philips.com)
for suggesting this.
2003-11-01 Sam Tregar <sam, at, tregar.com>
* Fixed a bug in perl_count that prevents it from
properly skipping POD.
2003-10-30 Julian Squires <julian, at, greyfirst.ca>
* Added simple literate Haskell support.
* Added test cases for literate Haskell support.
* Updated Common LISP and Modula 3 extensions.
2003-03-08 David A. Wheeler <dwheeler, at, dwheeler.com>
* Version 2.22 - improved OCAML support, thanks to Michal Moskal.
Other minor improvements.
2003-02-15 Jay A. St. Pierre
* Fixed uninstalling documents to always remove DOC_DIR.
2003-02-15 Michal Moskal
* Significantly improved OCAML support - complete rewrite of
ML handling.
2003-01-28 David A. Wheeler <dwheeler, at, dwheeler.com>
* Version 2.21 - improved Fortran support (inc. Fortran 90);
my thanks to Erik Schnetter for implementing this!
2002-12-17 Erik Schnetter <schnetter, at, uni-tuebingen.de>
* Added support for Fortran 90. Extensions are ".f90" and ".F90".
* Changed handling of Fortran 77 to include HPF and Open MP
statements, and to accept uppercase ".F77" as extension.
2002-12-04 David A. Wheeler <dwheeler, at, dwheeler.com>
* Version 2.20 - minor portability and documentation improvements.
* Documentation improvements - more discussion on Intermediate COCOMO.
2002-12-04 Linh Luong <Linh.Luong, at, trw.com>
* Modified SLOCCount so that it would run on Solaris 2.7
(once Perl is installed and the PATH is set correctly to include
the directory where SLOCCount is installed).
This required modifying file sloccount to eliminate the
test ("[") option "-e", replacing it with the "-r" option
("test -e" is apparently not supported by Solaris 2.7).
Since "-r" should be available on any implementation of "test",
this is a nice portable change.
2002-11-16 David A. Wheeler <dwheeler, at, dwheeler.com>
* Version 2.19, documentation improvement.
* Documented the "Improved COCOMO" model from Boehm,
so that users who want more accurate estimates can do at
least a little bit straight from the documentation.
For more, as always, see Boehm's book.
If anyone wants to implement logical SLOC counting, please be
my guest! Then, COCOMO II could be implemented too.
* Modified this ChangeLog to document more fully the SGI MIPS problem.
2002-11-16 David A. Wheeler <dwheeler, at, dwheeler.com>
* Version 2.18, minor bugfix release.
* Updated the "wc -l" check; it would cause problems for users
who had never used sloccount before (because datadir had not
been created yet). Also, the "wc -l" check itself would not
reliably identify SGI systems that had horribly buggy "wc"
programs; it's believed this is a better check.
Thanks to Randal P. Andress for helping with this.
* Fixed this ChangeLog. It was Randal P. Andress who identified
the "wc -l" bug, not Bob Brown. Sorry for the misattribution,
and thanks for the bugfixing help!
* Changed rpm building command to work with rpm version 4
(as shipped with Red Hat Linux 8.0). As of Red Hat Linux 8,
the "rpm" command only loads files, while there is now a
separate "rpmbuild" command for creating rpm files.
Those rebuilding with Red Hat Linux 7.X or less (rpm < version 4)
will need to edit the makefile slightly, as documented
in the makefile, to modify the variable RPMBUILD.
* "make rpm" now automatically uninstalls sloccount first if it can,
to eliminate unnecessary errors when building new versions of
sloccount RPMs. This only affects people modifying and
redistributing the code of sloccount (mainly, me).
2002-11-16 Randal P. Andress
* Fixed get_sloc so that it
also accepts --filecounts as well as --filecount.
2002-11-05 David A. Wheeler <dwheeler, at, dwheeler.com>
* Released version 2.17, which adds support for Java Server Pages
(.jsp), eliminates some warnings in newer Perl implementations,
and has a few minor fixes/improvements.
2002-11-18 Randal P. Andress
* Randal provided the following additional information about this
really nasty problem on SGI MIPS machines. It causes gcc
to not work properly, and thus "wc" won't work properly either.
SLOCCount now detects that there's a problem and will refuse to
run if things are screwed up this badly. For those unfortunate
few who have to deal with this case, here's additional information
from Randal Andress:
When gcc is installed on SGI MIPS from source, sgi-mips-sgi-irix6.x,
an option specification in the 'specs' file is set
incorrectly for n32. The offending line is:
%{!mno-long64:-D__LONG_MAX__=9223372036854775807LL}
Which (unless option '-mno-long64' is specified), means that
LONG_MAX is 64 bits. The trouble is twofold:
1. This should not be the default, since for n32,
normally, long is only 32 bits; and
2. The option did not carry into the
compiler past the pre-processor, so it did not work.
The simplest fix for gcc (it seems that it can be done locally by
editing the specs file) is to have the following line to
replace the offending line in the specs file:
%{long64:-D__LONG_MAX__=9223372036854775807LL}
This makes the default 32 and only sets it to 64 if you specify
'-long64' which *does* work all the way through the compiler.
I had the binary for gcc 3 on the sgi freeware site installed here and
looked at its specs file and found no problem (they have the '-long64'
option). So it seems that when they build gcc for their freeware
distribution, they fix it.
The problem comes when someone downloads and builds gcc for themselves
on sgi. Then the installation is faulty and any n32 code that they
build is subject to this flaw if the source makes use of LONG_MAX
or any of the values derived from it.
The real problem turned out to be quite general for sgi n32 gcc. The
'specs' file and mips.h are not consistent resulting in 'LONG_MAX'
being given an incorrect value.
The following 'c' program shows inconsistent values for macros for
mips-irix n32:
__LONG_MAX__ (LONG_MAX) and
_MIPS_SZLONG
This seems to stem from an improper default option in the specs file
forcing -D__LONG_MAX__=0x7fffffffffffffff
to be passed to each compile.
Here is the test case, compile command, and output:
#include <stdio.h>
#include <limits.h>
#define LONG_MAX_32_BITS 2147483647
#include <sys/types.h>
int main () {
#if LONG_MAX <= LONG_MAX_32_BITS
  printf ("LONG_MAX <= LONG_MAX_32_BITS = 0x%lx\n",LONG_MAX);
#else
  printf ("LONG_MAX > LONG_MAX_32_BITS = 0x%llx\n",LONG_MAX);
#endif
  printf ("_MIPS_SZLONG = 0x%x\n",_MIPS_SZLONG);
  printf ("__LONG_MAX__ = 0x%llx (size:%d)\n",__LONG_MAX__,sizeof(__LONG_MAX__));
#if LONG_MAX <= LONG_MAX_32_BITS
  printf ("LONG_MAX = 0x%lx (size:%d)\n",LONG_MAX,sizeof(LONG_MAX));
#else
  printf ("LONG_MAX = 0x%llx (size:%d)\n",LONG_MAX,sizeof(LONG_MAX));
#endif
  printf ("LONG_MAX_32_BITS = 0x%x (size:%d)\n",LONG_MAX_32_BITS,sizeof(LONG_MAX_32_BITS));
  return 0;
}
============ end test case source.
>gcc -n32 -v -o test_limits -O0 -v -g test_limits.c
defines include:....-D__LONG_MAX__=9223372036854775807LL....
=========== test output:
>test_limits
LONG_MAX > LONG_MAX_32_BITS = 0x7fffffffffffffff
_MIPS_SZLONG = 0x20
__LONG_MAX__ = 0x7fffffffffffffff (size:8)
LONG_MAX = 0x7fffffffffffffff (size:8)
LONG_MAX_32_BITS = 0x7fffffff (size:4)
======== end test case output
By changing the specs entry:
%{!mno-long64:-D__LONG_MAX__=9223372036854775807LL}
to
%{long64:-D__LONG_MAX__=9223372036854775807LL}
as is discussed in one of the internet reports I sent earlier, the
output,
after recompiling and running is:
LONG_MAX <= LONG_MAX_32_BITS = 0x7fffffff
_MIPS_SZLONG = 0x20
__LONG_MAX__ = 0x7fffffff (size:4)
LONG_MAX = 0x7fffffff (size:4)
LONG_MAX_32_BITS = 0x7fffffff (size:4)
Although I have not studied it well enough to know exactly why, the
problem has to do with the size of (long int) and the attempt of the
'memchr' code to determine whether or not it can use 64 bit words
rather than 32 bit words in chunking through the string looking
for the specified character, "\n"(0x0a) in the case of 'wc'.
2002-11-03 David A. Wheeler <dwheeler, at, dwheeler.com>
* Fixed makefile install/uninstall scripts to properly handle
documentation.
* Added simple check at beginning of sloccount's execution
to make sure "wc -l" actually works.
Randal P. Andress has found that on certain SGI machines, "wc -l"
produces the wrong answers. He reports,
"You may already know this, but just in case you do not, there is an
apparent bug in textutils-1.19 function 'wc' (at least as built on
SGI-n32) which is caused by an apparent bug in memchr (*s, c, n).
The bug is only evident when counting 'lines only' or
'lines and characters' (i.e., when NOT counting words).
The result is that the filecount is short...
I replaced the memchr with very simple code and it corrected the
problem. I then installed textutils-2.1 which does not seem to have
the problem."
I thought about adding this information just to the documentation,
but no one would notice it. By adding a check to the code,
most people will neither know nor care about the problem, and
the few people it DOES affect will know about the problem
right away (instead of reporting wrong answers).
Yes, a failing "wc -l" is a pretty horrific bug, but rather
than ignore the problem, it's better to detect and address it.
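A startup check of this kind could look roughly like the Python sketch
below. It is an assumption-laden illustration, not the check sloccount
actually performs: the file size of 1000 lines and the names wc_lines and
line_counter_works are hypothetical. The idea is simply to count the lines
of a file whose length is known in advance and refuse to continue on a
mismatch.

```python
import os
import subprocess
import tempfile


def wc_lines(path):
    """Ask the system's 'wc -l' how many lines the file has."""
    out = subprocess.run(
        ["wc", "-l", path], capture_output=True, text=True, check=True
    ).stdout
    return int(out.split()[0])


def line_counter_works(count_lines=wc_lines, expected=1000):
    """Write a file with a known number of lines and verify the count.

    count_lines is injectable so the check itself can be tested with a
    deliberately broken counter.
    """
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
        f.write("x\n" * expected)
        path = f.name
    try:
        return count_lines(path) == expected
    finally:
        os.remove(path)


# A counter that comes up short, like the buggy "wc", is detected.
print(line_counter_works(count_lines=lambda path: 999))
```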
* Modified documentation everywhere so that it consistently
documents "--filecount" as the correct option for filecounts,
not "--filecounts". That way, the documentation is consistent.
* However, in an effort to "do the right thing", the program sloccount
will accept "--filecounts" as an alternative way to specify
--filecount.
2002-11-02 Bob Brown <rlb, at, bluemartini.com>
* Contributed code changes to count Java Server Page (.jsp) files.
The code does not pull comments out of embedded
javascript. We don't consider that a serious limitation at all,
since no one should be sending embedded javascript comments
to client browsers anyhow. They're extremely rare.
David A. Wheeler notes that you could
argue that if you _DO_ include such comments, they're
not really functioning as comments (since they DO have an
effect on the result - they're more like print statements in an
older language instead of a traditional language's comments).
2002-11-02 David A. Wheeler <dwheeler, at, dwheeler.com>
* Eliminated more Perl warnings by adding more
defined() wrappers to while() loops in Perl code
(based on Randal's suggestion). The problem is that Perl
handles the last line of a file oddly if it doesn't end with
a newline indicator, and it consists solely of "0".
2002-11-02 Randal P Andress <Randal_P_Andress, at, raytheon.com>
* Eliminated some Perl warnings by adding
defined() wrappers to while() loops in Perl code.
2002-8-24 David A. Wheeler <dwheeler, at, dwheeler.com>
* Released version 2.16, fixed limitations of old Pascal counter.
2002-8-24 David A. Wheeler <dwheeler, at, dwheeler.com>
* Re-implemented Pascal counter (in flex). This fixes some problems
the old counter had - it handles nested comments with different
formats, and strings as well.
* Removed the BUGS information that described the Pascal counter
weaknesses.. since now they're gone!
* Added an additional detector of automatically generated files -
it's an auto-generated file if it starts with
"A lexical scanner generated by flex", since flex adds this.
Generally, this isn't a problem, since we already detect
the filename and matching .c files, but it seems worth doing.
2002-8-22 David A. Wheeler <dwheeler, at, dwheeler.com>
* Released version 2.15, a bugfix + small feature improvement.
My sincere thanks to Jesus M. Gonzalez-Barahona, who provided
patches with lots of useful improvements.
2002-8-22 Jesus M. Gonzalez-Barahona
* Added support for Standard ML (as language "ml").
* A patch suggested to the Debian BTS; .hh is also a C++ extension.
* Some ".inc" files are actually Pascal, not PHP;
now ".inc" files are examined and binned as either Pascal or PHP
depending on their content.
* Improved detection of Pascal files (particularly for Debian
package fpc-1.0.4).
* php_count was not closing open files before opening a new one,
and therefore sloccount could fail to count PHP code given
a VERY LONG list of PHP files in one package.
* break_filelist had problems with files including <CR> and other
weird characters at the end of the filename. Now fixed.
2002-7-24 David A. Wheeler <dwheeler, at, dwheeler.com>
* Released version 2.14. Improved Pascal detection, improved
Pascal counting, added a reference to CCCC.
2002-7-24 David A. Wheeler <dwheeler, at, dwheeler.com>
* Modified Pascal counting; the older (*..*) commenting structure
is now supported. Note that the Pascal counter is still imperfect;
it doesn't handle the prioritization between these two commenting
systems, and can be fooled by strings that include a
comment start indicator. Rewrites are welcome; however, for most
people the current code is sufficient. This really needs to be
rewritten in flex; languages with strings and multiline comment
structures aren't handled correctly with naive Perl code.
* Documented the weaknesses in the Pascal counter as BUGS.
2002-7-24 Ian West IWest, at, aethersystems, dot com
* Improved heuristic for detecting Pascal programs in break_filelist.
Sloccount will now categorize files as Pascal if they have
the file type ".pas" as well as ".p", though it still checks
the contents to make sure it's really pascal.
The heuristic was modified so that it's also considered Pascal
if it contains "module" and "end.",
or "program", "begin", and "end." in addition to the existing cases.
(Ian West used sloccount to analyze a system containing
about 1.2 million lines of code in almost 10,000 files;
ninety percent of it is Ada, and the bulk of the remainder
is split between Pascal and SQL. The following is Ian's
more detailed explanation for the change):
VAX Pascal uses "module" instead of "program" for files that
have no program block and therefore no "begin".
There is also no requirement for a Pascal file to have
procedures or functions, which is the case for files that are
equivalents of C headers. So I modified the function to
allow files to be accepted that only contain either:
"module" and "end."; or "program", "begin", and "end.".
I considered adding checks for "const", "type", and "var" but
decided they were not necessary. I have added the extra cases
without changing the existing logic so as not to upset
any cases for "unit". It is possible to optimize the logic
somewhat, but I felt clarity was better than efficiency.
I found that some of my Pascal files were getting through
only because the word "unit" appeared in certain comments.
So I moved the line for filtering out comments above the lines
that look for the keywords.
Pascal in general allows comments in the form (*...*) as well
as {...}, so I added a line to remove these.
After making these changes, all my files were correctly
categorized. I also verified that the sample Pascal files
from p2c still had the same counts.
Thank you for developing SLOCCount. It is a very useful tool.
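The additional cases Ian describes can be sketched as follows. This is a
Python illustration, not the Perl heuristic in break_filelist: it covers
only the two new cases ("module" with "end.", or "program" with "begin" and
"end."), leaves out the pre-existing cases such as "unit", and assumes
case-insensitive keyword matching. The function names are hypothetical.

```python
import re


def strip_pascal_comments(text):
    """Remove {...} and (*...*) comments before looking for keywords."""
    text = re.sub(r"\{.*?\}", " ", text, flags=re.S)
    text = re.sub(r"\(\*.*?\*\)", " ", text, flags=re.S)
    return text


def looks_like_pascal(text):
    """Apply the two added cases to comment-free, lowercased text."""
    code = strip_pascal_comments(text).lower()

    def has_word(word):
        return re.search(r"\b" + word + r"\b", code) is not None

    has_end = re.search(r"\bend\.", code) is not None
    return has_end and (
        has_word("module") or (has_word("program") and has_word("begin"))
    )


sample = "program hello;\nbegin\n  writeln('hi')\nend.\n"
print(looks_like_pascal(sample))
```

Stripping comments first is what prevents a file from qualifying only
because a keyword happens to appear inside a comment.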
2002-7-15 David A. Wheeler <dwheeler, at, dwheeler.com>
* Added a reference to CCCC; http://cccc.sourceforge.net/
2002-5-31 David A. Wheeler <dwheeler, at, dwheeler.com>
* Released version 2.13.
* Code cleanups. Turned on gcc warnings ("-Wall" option) and
cleaned up all code that set off a warning.
This should make the code more portable as well as cleaner.
Made a minor speed optimization on an error branch.
2002-3-30 David A. Wheeler <dwheeler, at, dwheeler.com>
* Released version 2.12.
* Added a "testcode" directory with some sample source code
files for testing. It's small now, but growth is expected.
Contributions for this test directory (especially for
edge/oddball cases) are welcome.
2002-3-25 David A. Wheeler <dwheeler, at, dwheeler.com>
* Changed first-line recognizers so that the first line (#!) will
be matched ignoring case. For most Unix/Linux systems uppercase
script statements won't work, but files from Windows users may contain them.
* Now recognize SpeedyCGI, a persistent CGI interface for Perl.
SpeedyCGI has most of the speed advantages of FastCGI, but
has the security advantages of CGI and has the CGI interface
(from the application writer's point of view).
SpeedyCGI perl scripts have #!/usr/bin/speedy lines instead of
#!/usr/bin/perl. More information about SpeedyCGI
can be found at http://daemoninc.com/speedycgi/
Thanks to Priyadi Iman Nurcahyo for noticing this.
2002-3-15 David A. Wheeler <dwheeler, at, dwheeler.com>
* Added filter to remove calls to sudo, so
"#!/usr/bin/sudo /usr/bin/python" etc as the first line
are correctly identified.
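The sudo filter can be pictured with the Python sketch below. It is an
illustration rather than the recognizer in sloccount; the function name
script_interpreter is hypothetical, and lowercasing the result follows the
case-insensitive matching noted in the 2002-3-25 entry.

```python
def script_interpreter(first_line):
    """Return the interpreter named on a '#!' line, or None.

    A leading call to sudo is skipped so that the program it launches
    is the one reported.
    """
    line = first_line.strip()
    if not line.startswith("#!"):
        return None
    words = line[2:].split()
    # Drop a leading sudo so the real interpreter is seen.
    if words and words[0].rsplit("/", 1)[-1].lower() == "sudo":
        words = words[1:]
    if not words:
        return None
    return words[0].rsplit("/", 1)[-1].lower()


print(script_interpreter("#!/usr/bin/sudo /usr/bin/python"))
```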
2002-3-7 David A. Wheeler <dwheeler, at, dwheeler.com>
* Added cross-references to LOCC and CodeCount. They don't
do what I want.. which is why I wrote my own! .. but others
may find them useful.
2002-2-28 David A. Wheeler <dwheeler, at, dwheeler.com>
* Released version 2.11.
* Added support for C#. Any ".cs" file is presumed
to be a C# file. The C SLOC counter is used to count SLOC.
Note that C# doesn't have a "header" type (Java doesn't either),
so disambiguating headers isn't needed.
* Added support for regular Haskell source files (.hs).
Their syntax is sufficiently similar that just the regular
C SLOC counter works.
Note that literate Haskell files (.lhs) are _not_ supported,
so be sure to process .lhs files into .hs files before counting.
There are two different .lhs conventions; for more info, see:
http://www.haskell.org/onlinereport/literate.html
* Tweaked COBOL counter slightly. Added support in fixed (default)
format for "*" and "/" as comment markers in column 1.
* Modified list of file extensions known not to be source code,
based on suffixes(7). This speeds things very slightly, but the
main goal is to make the "unknown" list smaller.
That way, it's much easier to see if many source code files
were incorrectly ignored. In particular, compressed formats
(e.g., ".tgz") and multimedia formats (".wav") were added.
* Modified documentation to make things clear: If you want source
in a compressed file to be counted (e.g. .zip, .tar, .tgz),
you need to uncompress the file first!!
* Modified documentation to clarify that literate programming
files must be expanded first.
* Now recognize ".ph" as Perl (it's "Perl header" code).
Please let me know if this creates many false positives
(i.e., if there are programs using ".ph" in other ways).
* File count_unknown_ext modified slightly so that it now examines
~/.slocdata. Modified documentation so that its use is
recommended and explained. It's been there for a while, but
with poor documentation I bet few understand its value.
* Modified output to clearly say that it's Open Source Software /
Free Software, licensed under the GPL. It was already stated
that way in the documentation and code, but clearly stating this
on every run makes it even harder to miss.
2002-2-27 David A. Wheeler <dwheeler, at, dwheeler.com>
* Released version 2.10.
* COBOL support added! Now ".cbl" and ".cob" are recognized
as COBOL extensions, as well as their uppercase ".CBL" and ".COB".
The COBOL counter works as follows:
it detects if a "freeform" command has been given. Unless a
freeform command's given, a comment has "*" or "/" in column 7,
and a SLOC is a non-comment line with
at least one non-whitespace in column 8 or later (including
columns 72 or greater; it's arguable if a line that's empty
before column 72 is really a line or a comment, but I've decided
to count such odd things as lines).
If we've gone free-format, a comment is a line that has optional
whitespace and then "*".. otherwise, a line with nonwhitespace
is a SLOC.
Is this good enough? I think so, but I'm not a major COBOL user.
Feedback from real COBOL users would be welcome.
A source for COBOL test programs is:
http://www.csis.ul.ie/cobol/examples/default.htm
Information on COBOL syntax gathered from various locations, inc.:
http://cs.hofstra.edu/~vmaffea1/cobol.html
http://support.merant.com/websupport/docs/microfocus/books/
nx31books/lrintr.htm
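The counting rules above can be sketched in Python as follows. This is an
illustration, not the COBOL counter shipped with sloccount: detection of the
"freeform" command is left out and replaced by a free_format parameter, tabs
are not expanded, and the function name count_cobol_sloc is hypothetical.

```python
def count_cobol_sloc(lines, free_format=False):
    """Count SLOC using the fixed-format or free-format rules.

    Fixed format: '*' or '/' in column 7 marks a comment; otherwise a
    line counts if it has non-whitespace in column 8 or later.
    Free format: optional whitespace followed by '*' marks a comment;
    any other line with non-whitespace counts.
    """
    sloc = 0
    for line in lines:
        line = line.rstrip("\n")
        if free_format:
            stripped = line.lstrip()
            if stripped and not stripped.startswith("*"):
                sloc += 1
        else:
            # Column 7 is index 6; column 8 onward is line[7:].
            if len(line) >= 7 and line[6] in "*/":
                continue
            if line[7:].strip():
                sloc += 1
    return sloc


fixed = [
    "000100 IDENTIFICATION DIVISION.",
    "000200* this is a comment",
    "000300/ page eject comment",
    "000400",
    "000500 PROGRAM-ID. HELLO.",
]
print(count_cobol_sloc(fixed))
```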
* Modified handling of uppercase filename extensions so they'll
be recognized as well as the more typical lowercase extensions.
If a file has one or more uppercase letters - and NO
lowercase letters - it's assumed that it may be a refugee from
an old OS that supported only uppercase filenames.
In that circumstance, if the filename extension doesn't match the
set of known extensions, it's made into lowercase and recompared
against the set of extensions for source code files.
This heuristic should improve recognition of source
file types for "old" programs using upper-case-only characters.
I do have concern that this may be "too greedy" an algorithm, i.e.,
it might claim that some files that aren't really source code
are now source code. I don't think it will be a problem, though;
few people create filename
extensions that differ only by case in most circumstances; the
".c" vs. ".C" thing is an exception, and since Windows folds
case it's not a very portable practice. This is a pretty
conservative heuristic; I found Cobol programs with lowercase
filenames and uppercase extensions ("x.CBL"), which wouldn't
be matched by this heuristic. For Cobol and Fortran I put in
special ".F", ".CBL", and ".COB" patterns to catch them.
With those two actions, the program should manage to
correctly identify more source files without incorrectly
matching non-source files.
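The extension heuristic described above can be sketched like this. It is a
Python illustration only: the table KNOWN is a tiny, assumed subset of the
real extension list, the language labels are placeholders, and the function
name language_for is hypothetical.

```python
import os

# Illustrative subset; ".F", ".CBL" and ".COB" are the special uppercase
# patterns the entry mentions. Labels are placeholders for this sketch.
KNOWN = {
    ".c": "ansic", ".C": "cpp", ".pas": "pascal",
    ".f": "fortran", ".F": "fortran",
    ".cbl": "cobol", ".cob": "cobol", ".CBL": "cobol", ".COB": "cobol",
}


def language_for(filename):
    """Look up the extension; fall back to lowercase only for names
    that contain uppercase letters and no lowercase letters at all."""
    ext = os.path.splitext(filename)[1]
    if ext in KNOWN:
        return KNOWN[ext]
    has_upper = any(ch.isupper() for ch in filename)
    has_lower = any(ch.islower() for ch in filename)
    if has_upper and not has_lower:
        return KNOWN.get(ext.lower())
    return None


for name in ("MAIN.PAS", "Main.PAS", "x.CBL", "FOO.C"):
    print(name, language_for(name))
```

Checking the exact extension first is what keeps ".C" distinct from ".c"
even for an all-uppercase name.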
* ".f77" is now also accepted as a Fortran77 extension.
Thanks to http://www.webopedia.com/quick_ref/fileextensionsfull.html
which has lots of extension information.
* Fixed a bug in handling top-level directories where there were NO
source files at all; in certain cases this would create
spurious error messages. (Fix in compute_all).
2002-1-7 David A. Wheeler <dwheeler, at, dwheeler.com>
* Released version 2.09.
2002-1-9 David A. Wheeler <dwheeler, at, dwheeler.com>
* Added support for the Ruby programming language, thanks to
patches from Josef Spillner.
* Documentation change: added more discussion about COCOMO,
in particular why its cost estimates appeared so large.
Some programmers think of just the coding part, and only what
they'd get paid directly.. but that's less than 10% of the
costs.
2002-1-7 David A. Wheeler <dwheeler, at, dwheeler.com>
* Minor documentation fix - the example for --effort in
sloccount.html wasn't quite right (the base documentation
for --effort was right, it was just the example that was wrong).
My thanks to Kevin the Blue for pointing this out.
2002-1-3 David A. Wheeler <dwheeler, at, dwheeler.com>
* Released version 2.08.
2002-1-3 David A. Wheeler <dwheeler, at, dwheeler.com>
* Based on suggestions by Greg Sjaardema <gdsjaar@sandia.gov>:
* Modified c_count.c, function count_file to close the stream
after the file is analyzed. Otherwise, this can cause problems
with too many open files on some systems, particularly on
operating systems with small limits (e.g., Solaris).
* Added '.F' as a Fortran extension.
2002-1-2 David A. Wheeler <dwheeler, at, dwheeler.com>
* Released version 2.07.
2002-1-2 Vaclav Slavik <vaclav.slavik@matfyz.cz>
* Modified the RPM .spec file in the following ways:
* By default the RPM package now installs into /usr (so binaries
go into /usr/bin). Note that those who use the makefile directly
("make install"), including tarball users,
will still default to /usr/local instead.
You can still make the RPM install to /usr/local by using
the prefix option, e.g.:
rpm -Uvh --prefix=/usr/local sloccount*.rpm
* Made it use the %{_prefix} variable, i.e., changing it to install
in /usr/local or /usr is a matter of changing one line.
* Use wildcards in the %files section, so that you don't have to modify
the specfile when you add a new executable.
* Mods to make it possible to build the RPM as non-root (i.e.
BuildRoot support, %defattr in %files, PREFIX passed to make install)
2002-1-2 Jesus M. Gonzalez Barahona <jgb@debian.org>
* Added support for Modula-3 (.m3, .i3).
* ".sc" files are counted as Lisp.
* Modified sloccount to handle EVEN LARGER systems (i.e.,
so sloccount will scale even more).
In a few cases, parameters were passed on the command line,
and for very large systems (e.g., Debian GNU/Linux) the command
line became too long. Fixing this required a large number
of changes to different files to remove these scalability
limitations.
* All *_count programs now accept "-f filename" and "-f -" options,
where 'filename' is a file with a list of filenames to count.
Internally the "-f" option with a filename is always used, so
that an arbitrarily long list of files can be measured and so
that "ps" will show more status information.
* compute_sloc_lang modified accordingly.
* get_sloc now has a "--stdin" option.
* Some small fixes here and there.
* This closes Debian bug #126503.
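The "-f filename" / "-f -" convention above can be illustrated with a small Python sketch. This is not the code of any *_count program; the function name read_file_list and the one-filename-per-line format are assumptions made for the example.

```python
import sys

def read_file_list(list_name, stdin=sys.stdin):
    """Return the filenames listed in list_name, one per line.

    A list_name of "-" means "read the list from standard input".
    Reading names from a file avoids command-line length limits.
    """
    if list_name == "-":
        return [line.rstrip("\n") for line in stdin if line.strip()]
    with open(list_name, encoding="utf-8", errors="surrogateescape") as handle:
        return [line.rstrip("\n") for line in handle if line.strip()]
```

Because the names arrive one per line, a filename that itself contains a newline cannot be represented in such a list.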
2001-12-28 David A. Wheeler <dwheeler, at, dwheeler.com>
* Released sloccount 2.06.
2001-12-27 David A. Wheeler <dwheeler, at, dwheeler.com>
* Fixed a minor bug in break_filelist, which caused
(in extremely unusual circumstances) a problem when
disambiguating C from C++ files in complicated situations
where this difference was hard to tell. The symptom: When
analyzing some packages (for instance, afterstep-1.6.10 as
packaged in Debian 2.2) you would get the following error:
Use of uninitialized value in pattern match (m//) at
/usr/bin/break_filelist line 962.
This could only happen after many other disambiguating rules
failed to determine if a file was C or C++ code, so the problem
was quite rare.
My thanks to Jesus M. Gonzalez-Barahona (in
Mostoles, Spain) for the patch that fixes this problem.
* Modified man page, explaining the problems of filenames with
newlines, and also noting the problems with directories
beginning with "-" (they might be confused as options).
* Minor improvements to Changelog text, so that the
changes over time were documented more clearly.
* Note that CEPIS "Upgrade" includes a paper that depends
on sloccount. This is "Counting Potatoes: the Size of Debian 2.2"
which counts the size of Debian 2.2 (instead of Red Hat Linux,
which is what I counted). The original release is at:
<http://www.upgrade-cepis.org/issues/2001/6/upgrade-vII-6.html>.
I understand that they'll make some tweaks and
release a revision of the paper on the Debian website.
It's interesting; Debian 2.2 (released in 2000, and
which did NOT have KDE), has 56 million physical SLOC and
would have cost $1.8 billion USD to develop traditionally.
That's more than Red Hat; see <http://www.dwheeler.com/sloc>.
Top languages: C (71.12%), C++ (9.79%), LISP, Shell, Perl,
Fortran, Tcl, Objective-C, Assembler, Ada, and Python in that
order. My thanks to the authors!
2001-10-25 David A. Wheeler <dwheeler, at, dwheeler.com>
* Released sloccount 2.05.
* Added support for detecting and counting PHP code.
This was slightly tricky, because PHP's syntax has a few "gotchas"
like "here document" strings, closing tags working even inside C++ or sh
style comments, and so on.
Note - HTML files (.html, .htm, etc) are not examined for PHP code.
You really shouldn't put a lot of PHP code in HTML documents, because
it's a maintenance problem later anyway.
The tool assigns every file a single type, which is a problem,
because HTML files could have multiple simultaneous embedded types
(PHP, javascript, and HTML text). If the tool was modified to
assign multiple languages to a single file, I'm not sure how
to handle the file counts (counts of files for each language).
For the moment, I just assign HTML to "html".
* Modified output so that it adds a header before the language list.
2001-10-23 David A. Wheeler <dwheeler, at, dwheeler.com>
* Released sloccount 2.01 - a minor modification to support
Cygwin users.
* Modified compute_all to make it more portable (== became =);
in particular this should help users using Cygwin.
* Modified documentation to note that, if you install Cygwin,
you HAVE to use Unix newlines (not DOS newlines) for the Cygwin
install. Thanks to Mark Ericson for the bug report & for helping
me track that down.
* Minor cleanups to the ChangeLog.
2001-08-26 David A. Wheeler <dwheeler, at, dwheeler.com>
* Released sloccount 2.0 - it's getting a new version number because
its internal data format changed. You'll have to re-analyze
your system for the new sloccount to work.
* Improved the heuristics to identify files (esp. .h files)
as C, C++, or objective-C. The code now recognizes
".H" (as well as ".h") as header files.
The code realizes that ".cpp" files that begin with .\"
or ,\" aren't really C++ files - XFree86 stores many
man pages with these extensions (ugh).
* Added the ability to "--append" analyses.
This means that you can analyze some projects, and then
repeatedly add new projects. sloccount even stores and
recovers md5 checksums, so it even detects duplicates
across the projects (the "first" project gets the duplicate).
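A minimal sketch of checksum-based duplicate detection across projects, in Python. It only illustrates the idea that the first project seen keeps a duplicated file; the function name, the in-memory input format, and the return shape are hypothetical and do not reflect sloccount's data files.

```python
import hashlib

def assign_unique_files(projects):
    """Decide which files each project gets to count.

    projects: ordered mapping of project name -> {path: file contents (bytes)}.
    Returns {project: [paths counted for that project]}. A file whose MD5
    digest was already seen (in this or an earlier project) is skipped.
    """
    seen = set()
    counted = {}
    for project, files in projects.items():
        counted[project] = []
        for path, data in files.items():
            digest = hashlib.md5(data).hexdigest()
            if digest in seen:
                continue  # duplicate: the first project keeps it
            seen.add(digest)
            counted[project].append(path)
    return counted
```

Storing the digests between runs is what would let an appended project be compared against projects analyzed earlier.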
* Added the ability to mark a data directory so that it's not
erased (just create a file named "sloc_noerase" in the
data directory). From then on, sloccount won't erase it until
you remove the file.
* Many changes made aren't user-visible.
Completely re-organized break_filelist, which was getting
incredibly baroque. I've improved the sloccount code
so that adding new languages is much simpler; before, it
required a number of changes in different places, which was bad.
* SLOCCount now creates far fewer files, which is important for
analyzing big systems (I was starting to run out of inodes when
analyzing entire GNU/Linux distributions).
Previous versions created stub files in every child directory
for every possible language, even those that weren't used;
since most projects only use a few languages, this was costly in
terms of inodes. Also, the totals for each language for a given
child directory are now in a single file (all-physical.sloc)
instead of being in separate files; this not only reduces inode
counts, but it also greatly simplifies later processing & eliminated
a bug (now, to process all physical SLOC counts in a given child
directory, just process that one file).
2001-06-22 David A. Wheeler <dwheeler, at, dwheeler.com>
* Per Prabhu Ramachandran's suggestion, recognize ".H" files as
".h"/".hpp" files (note the upper case).
2001-06-20 David A. Wheeler <dwheeler, at, dwheeler.com>
* Released version 1.9. This eliminates installation errors
with "sql_count" and "makefile_count",
detects PostgreSQL embedded C (in addition to Oracle and Informix),
improves detection of Pascal code, and includes support for
analyzing licenses (if a directory has the file PROGRAM_LICENSE,
the file's contents are assumed to have the license name for that
top-level program). It eliminates a portability problem, so
hopefully it'll be easier to run it on Unix-like systems.
It _still_ requires the "md5sum" program to run.
2001-06-14 David A. Wheeler <dwheeler, at, dwheeler.com>
* Changed the logic in make_filelists.
This version doesn't require a "-L" option to test, which GNU
programs supported but which others (e.g., Solaris) didn't.
It still doesn't normally follow symlinks.
Not following subordinate symlinks is important for
handling oddities such as pine's build directory
/usr/src/redhat/BUILD/pine4.33/ldap in Red Hat 7.1, which
includes symlinks to directories not actually inside the
package at all (/usr/include and /usr/lib).
* Added display of licenses in the summary form, if license
information is available.
* Added undocumented programs rpm_unpacker and extract_license.
These are not installed at this time; they're just provided as
a useful starting point if someone wants them.
2001-06-12 David A. Wheeler <dwheeler, at, dwheeler.com>
* Added support for license counting. If the top directory
of a program has a file named "PROGRAM_LICENSE", it's copied to
the .slocdata entry, and it's reported as part of a licensing total.
Note that the file LICENSE is ignored; that's often more complex.
2001-06-08 David A. Wheeler <dwheeler, at, dwheeler.com>
* Fixed RPM spec file - it accidentally didn't install
makefile_count and sql_count. This would produce spurious
errors and inhibited the option of counting makefiles and SQL.
Also fixed the makefile to include sql_count in the executable list.
2001-05-16 David A. Wheeler <dwheeler, at, dwheeler.com>
* Added support for auto-detecting ".pgc" files, which are
embedded PostgreSQL - they are assumed to be C files (they COULD
be C++ instead; while this will affect categorization it
won't affect final SLOC counts). Also, if there's a ".c" with
a corresponding ".pgc" file, the ".c" file is assumed to be
auto-generated.
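The ".c next to .pgc" rule can be sketched as follows. This is an illustrative Python fragment under the assumption that the two files share a directory and base name; drop_generated_c is a hypothetical name, not part of SLOCCount.

```python
import os

def drop_generated_c(paths):
    """Return paths minus any ".c" file that has a sibling ".pgc" file."""
    present = set(paths)
    kept = []
    for path in paths:
        root, ext = os.path.splitext(path)
        if ext == ".c" and root + ".pgc" in present:
            continue  # assumed to be generated from the .pgc source
        kept.append(path)
    return kept
```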
* Thus, SLOCCount now supports embedded database commands for
Oracle, Informix, and PostgreSQL. MySQL doesn't use an
"embedded" approach, but uses a library approach that SLOCCount
could already handle.
* Fixed documentation: HTML reserved characters misused,
sql_count undocumented.
2001-05-14 David A. Wheeler <dwheeler, at, dwheeler.com>
* Added modifications from Gordon Hart to improve detection
of Pascal source code files.
Pascal files which only have a "unit" in them (not a full program),
or have "interface" or "implementation",
are now detected as Pascal programs.
The original Pascal specification didn't support units, but
there are Pascal programs which use them. This should result in
more accurate counts of Pascal software that uses units.
He also reminded me that Pascal is case-insensitive, spurring a
modification in the detection routines (for those who insist on
uppercase keywords, a truly UGLY format, but we need to
support it to correctly identify such source code as Pascal).
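A simplified sketch of case-insensitive Pascal detection, in Python. The real detection routines apply more rules than this; the pattern below (a line starting with "program", "unit", "interface", or "implementation") and the name looks_like_pascal are assumptions made for illustration.

```python
import re

# Case-insensitive because Pascal keywords may be written in any case.
PASCAL_HINT = re.compile(
    r"^\s*(program|unit|interface|implementation)\b",
    re.IGNORECASE | re.MULTILINE,
)

def looks_like_pascal(text):
    """Return True if text contains a line starting with a Pascal hint keyword."""
    return PASCAL_HINT.search(text) is not None
```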
* Modified the documentation to note that I prefer unified diffs.
I also added a reference to the TODO file, and from here on
I'll post the TODO file separately on my web site.
2001-05-02 David A. Wheeler <dwheeler, at, dwheeler.com>
* Released version 1.8. Added several features to support
measuring programs with embedded database commands.
This includes supporting many Oracle & Informix embedded file types
(.pc, .pcc, .pad, .ec, .ecp). It also optionally counts
SQL files (.sql) and makefiles (makefile, Makefile, etc.),
though by default they are NOT included in lines-of-code counts.
See the (new) TODO file for limitations on makefile identification.
2001-04-30 David A. Wheeler <dwheeler, at, dwheeler.com>
* Per suggestion from Gary Myer, added optional "--addlang" option
to add languages not NORMALLY counted. Currently it only
supports "makefile" and "sql". The scheme for detecting
automatically generated makefiles could use improvement.
Normally, makefiles and sql won't be counted in the final reports,
but the front-end will make the calculations and if requested their
values will be provided.
* Added an "SQL" counter and a "makefile" counter.
* Per suggestions from Gary Myer, added detection for files where
database commands (Oracle and Informix) are embedded in the code:
.pc -> Oracle Preprocessed C code
.pcc -> Oracle preprocessed C++ Code
.pad -> Oracle preprocessed Ada Code
.ec -> Informix preprocessed C code
.ecp -> Informix preprocessed C code which calls the C preprocessor
before calling the Informix preprocessor.
Handling ".pc" has heuristics, since many use ".pc" to mean
"stuff about PCs". Certain filenames are not counted as C files (e.g.,
"makefile.pc" and "README.pc") even if they end in ".pc".
Note that if you stick C++ code into .pc files, it's counted as C.
These embedded files are normal source files of the respective
language, with database commands stuck into them, e.g.,
EXEC SQL select FIELD into :variable from TABLE;
which performs a select statement and puts the result into the
variable. The database preprocessor simply reads this file,
and converts all "EXEC SQL" statements into the appropriate calls
and outputs a normal program.
Currently the "automatically generated" detectors don't detect
this case. For the moment, just make sure the generated files
aren't around while running SLOCCount.
Currently the following are not handled (future release?):
.pco -> Oracle preprocessed Cobol Code
.pfo -> Oracle preprocessed Fortran Code
I don't have a Cobol counter. The Fortran counter only works
for f77, and I doubt .pfo is limited to that.
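The extension mapping and the ".pc" exception described in this entry might look roughly like this in Python. The language labels, the exclusion set (limited to the two names mentioned above), and the function name classify_embedded are hypothetical; the actual heuristics may cover more names.

```python
import os

# Extensions listed in the entry above, mapped to illustrative language labels.
EMBEDDED_SQL_EXTENSIONS = {
    ".pc": "ansic",   # Oracle preprocessed C
    ".pcc": "cpp",    # Oracle preprocessed C++
    ".pad": "ada",    # Oracle preprocessed Ada
    ".ec": "ansic",   # Informix preprocessed C
    ".ecp": "ansic",  # Informix preprocessed C (C preprocessor run first)
}

# Names that end in ".pc" but are "stuff about PCs", not source code.
NOT_SOURCE_PC_NAMES = {"makefile.pc", "README.pc"}

def classify_embedded(path):
    """Return a language label for an embedded-SQL source file, else None."""
    name = os.path.basename(path)
    if name in NOT_SOURCE_PC_NAMES:
        return None
    return EMBEDDED_SQL_EXTENSIONS.get(os.path.splitext(name)[1])
```

Unhandled extensions such as ".pco" and ".pfo" simply fall through to None here.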
2001-04-27 David A. Wheeler <dwheeler, at, dwheeler.com>
* Per suggestions from Gary Myer,
added ".a" and ".so" to the "not" list, since these are
libraries not source, and added the filename "Root" to the
"not" file list ("Root" has special meaning to CVS).
* Added a note about needing "md5sum" (Gary Myer)
* Added a TODO file. If something's on the TODO list that you'd
like, please write the code and send it in.
* Noted that running on Cygwin is MUCH slower than when running
on Linux. Truth in advertising is only fair.
2001-04-26 David A. Wheeler <dwheeler, at, dwheeler.com>
* Released version 1.6: the big change is support for running on
Windows. Windows users must install Cygwin first.
* Modified makefile so that SLOCCount can run on Windows systems
if "Cygwin" is installed. The basic modifications to do this
were developed by John Clezy -- Thanks!!! I spent time merging
his makefile and mine so that a single makefile could be used on
both Windows and Unix.
* Documented how to install and run SLOCCount on Windows using cygwin.
* Changed default prefix to /usr/local; you can set PREFIX to
change this, e.g., "make PREFIX=/usr".
* When counting a single project, sloccount now also reports
"Estimated average number of developers", which is simply
the person-months divided by months. As with all estimates, take
it with an ocean of salt. This isn't reported for multiproject
queries; properly doing this would require "packing" to compensate
for the fact that small projects complete before large ones if
started simultaneously.
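As a worked illustration of "person-months divided by months", here is a small Python sketch. The coefficients are the textbook basic COCOMO "organic" values (2.4, 1.05, 2.5, 0.38) and are an assumption for the example, as is the function name basic_cocomo; check the tool's own output for the model it actually reports.

```python
def basic_cocomo(sloc, effort_factor=2.4, effort_exp=1.05,
                 schedule_factor=2.5, schedule_exp=0.38):
    """Return (person_months, months, average_developers) for a SLOC count."""
    ksloc = sloc / 1000.0
    person_months = effort_factor * ksloc ** effort_exp
    months = schedule_factor * person_months ** schedule_exp
    # Average developers is simply effort divided by schedule.
    return person_months, months, person_months / months

if __name__ == "__main__":
    effort, schedule, developers = basic_cocomo(50_000)
    print(f"effort={effort:.1f} person-months, schedule={schedule:.1f} months, "
          f"developers={developers:.2f}")
```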
* Improved man page (fixed a typo, etc.).
2001-01-10 David A. Wheeler <dwheeler, at, dwheeler.com>
* Released version 1.4. This is an "ease of use" release,
greatly simplifying the installation and use of SLOCCount.
The new front-end tool "sloccount" does all the work in one step -
now just type "sloccount DIRECTORY" and it's all counted.
An RPM makes installation trivial for RPM-based systems.
A man page is now available. There are now rules for
"make install" and "make uninstall" too.
Other improvements include a schedule estimator and options
to control the effort and schedule estimators.
2001-01-07 David A. Wheeler <dwheeler, at, dwheeler.com>
* Added an estimator of schedule as well as effort.
* Added various options to control the effort and
cost estimation: "--effort", "--personcost", "--overhead",
and "--schedule".
Now people can (through options) control the assumptions made
in the effort and cost estimations from the command line.
The output now shows the effort estimation model used.
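A hedged Python sketch of how a per-person cost and an overhead multiplier could turn an effort estimate into a cost estimate. The formula (person-years times annual cost times overhead), the parameter names, and the sample numbers are illustrative assumptions, not the tool's documented behavior or defaults.

```python
def estimated_cost(person_months, person_cost, overhead):
    """Return an estimated development cost.

    person_months: estimated effort
    person_cost:   assumed average annual salary per person
    overhead:      assumed multiplier covering non-salary costs
    """
    person_years = person_months / 12.0
    return person_years * person_cost * overhead

if __name__ == "__main__":
    # Sample values chosen only for the example.
    print(estimated_cost(12, person_cost=50_000, overhead=2.0))
```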
* Changed the output slightly to pretty it up and note that
it's development EFFORT not TIME that is shown.
* Added a note at bottom asking for credit. I don't ask for any
money, but I'd like some credit if you refer to the data the
tool generates; a gentle reminder in the output seemed like the
easiest way to ask for this credit.
* Created an RPM package; now RPM-based systems can EASILY
install it. It's a relocatable package, so hopefully
"alien" can easily translate it to other formats
(such as Debian's .deb format).
* Created a "man" page for sloccount.
2001-01-06 David A. Wheeler <dwheeler, at, dwheeler.com>
* Added front-end tool "sloccount", GREATLY improving ease-of-use.
The tool "sloccount" invokes all the other SLOCCount tools
in the right order, performing a count of a typical project
or set of projects. From now on, this is expected to be the
"usual" interface, though the pieces will still be documented
to help those with more unusual needs.
From now on, "SLOCCount" is the entire package, and
"sloccount" is this front-end tool.
* Added "--datadir" option to make_filelists (to support
"sloccount").
* get_sloc: No longer displays languages with 0 counts.
* Documentation: documented "sloccount"; this caused major changes,
since "sloccount" is now the recommended interface for all but
those with complicated requirements.
* compute_filecount: minor optimization/simplification
2001-01-05 David A. Wheeler <dwheeler, at, dwheeler.com>
* Released version 1.2.
* Changed the name of many programs, as part of a general clean-up.
I changed "compute_all" to "compute_sloc", and eliminated
most of the other "compute_*" files (replacing them with
"compute_sloc_lang"). I also changed "get_data" to "get_sloc".
This is part of a general clean-up, so that
if someone wants to package this program for installation they
don't have a thousand tiny programs polluting the namespace.
Adding "sloc" to the names makes namespace collisions less likely.
I also worked to make the program simpler.
* Made a number of documentation fixes - my thanks to Clyde Roby
for giving me feedback.
* Changed all "*_count" programs to consistently print at the end
"Total:" on a line by itself, followed on the next line by
the total lines of code all by itself. This makes the new program
get_sloc_detail simpler to implement, and also enables
get_sloc_detail to perform some error detection.
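The trailing "Total:" convention makes the counters' output easy to check mechanically. A minimal Python sketch follows; parse_total is a hypothetical helper (not get_sloc_detail itself), and the per-file lines in the usage example are an assumed format.

```python
def parse_total(output):
    """Return the integer on the line following the last "Total:" line.

    Raises ValueError if no such pair of lines exists or the count is not
    an integer, which is the kind of error detection the convention allows.
    """
    lines = output.splitlines()
    for index in range(len(lines) - 2, -1, -1):
        if lines[index].strip() == "Total:":
            return int(lines[index + 1].strip())
    raise ValueError('no "Total:" line followed by a count')

if __name__ == "__main__":
    sample = "12 a.c\n30 b.c\nTotal:\n42\n"
    print(parse_total(sample))
```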
* Changed name of compressed file to ".tar.gz" and modified docs
appropriately. The problem is a bug in Netscape 4.7 clients
running on Windows; it appears that ".tgz" files don't get fully
downloaded from my hosting webserver because no type information
is provided. Originally, I tried to change the website to fix this
by creating ".htaccess" files, but that didn't work with either:
AddEncoding x-gzip gz tgz
AddType application/x-tar .tgz
or:
AddEncoding application/octet-stream tgz
So, we'll switch to .tar.gz, which works.
My thanks to Christopher Lott for this feedback.
* Removed a few garbage files.
* Added information to documentation on how to handle HUGE sets
of data directory children, i.e., where you can't even use "*"
to list the data directory children. I don't have a directory
of that kind of scale, so I can't test it directly,
but I can at least discuss how to do it; it SHOULD work.
* Changed makefile so that "ChangeLog" is now visible on the web.
2001-01-04 David A. Wheeler <dwheeler, at, dwheeler.com>
* Minor fixes to documentation.
* Added "--crossdups" option to break_filelist.
* Documented count_unknown_ext.
* Created new tool, "get_sloc_detail", and documented it.
Now you can get a complete report of all the SLOC data in one big
file (e.g., for exporting to another tool for analysis).
2001-01-03 David A. Wheeler <dwheeler, at, dwheeler.com>
* First public release, version "1.0", of "SLOCCount".
Main website: http://www.dwheeler.com/sloccount