bibtexformat
Download bibtexformat.zip (source and documentation)
Download bibtexformat.pdf (documentation only)
For support, feature requests and bug reports please send an email to
.
A minimal library file with the affected references is usually very helpful.
bibtexformat is – and will always be! – freeware. If you find it useful and wish to say thanks, then you can do so by contributing a little to the costs of keeping the website online.
Update March 9th 2011
- The configuration files now have the extension .cfg instead of none at all. Users keeping their old data should add the extension at least to the main configuration file (the other files are defined therein, so they are fine).
- The infile instruction was added to the syntax of -aopfix (Section 7.4, page 31) and -typefix (Section 7.2, page 26) configuration blocks. This makes it possible, for example, to change reference types or replace field names depending on a list of strings or regular expressions read from text files.
Update February 18th 2011
- A minor 2-byte fix with huge effect: pages fields ending with 2 or more dashes lost their closing brace. Fixed.
- Empty lines in entries (i.e. paragraph breaks) are now preserved by -defsubst. This also fixed a resulting bug with -s.
Update February 5th 2011
- The option -fieldregex makes it possible to apply one or more regular expressions to specific BibTeX fields. This is more selective than the usual substitutions, which are applied to every complete line of the library.
- The option -fileregex performs regular expression tasks on the file paths given in file fields. The function can be used, for example, to turn relative paths into absolute ones and vice versa.
- The option -fpapers complements the -f option. While the latter forces all labels to be re-created – thus overwriting all existing ones – -fpapers only forces the change of existing labels with the standard format of Papers (Author:2000p1749). This preserves all user-defined labels.
- With the -aopfix option it is possible to fix citations of ahead-of-print articles. Depending on multiple conditions, such as missing pages and a DOI being present, the necessary changes can be made to correct the citation.
- The replacement of fields during the reference type correction was fixed and can no longer lead to duplicate field entries.
- A bug was fixed concerning command line parameters that are also defined in the configuration file. Now the command line has highest precedence and overrides settings in the configuration file.
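The difference between -f and -fpapers described above can be illustrated with a short Python sketch. It is not taken from the script itself (which is written in Perl); the regular expression is an assumption derived only from the example label Author:2000p1749, and the function names are hypothetical.

```python
import re

# Assumed shape of a Papers default label, inferred from the example
# "Author:2000p1749": name, colon, four-digit year, "p", digits.
PAPERS_LABEL = re.compile(r"^[^:\s]+:\d{4}p\d+$")


def is_papers_default(label: str) -> bool:
    """Return True if the label looks like an auto-generated Papers key."""
    return PAPERS_LABEL.match(label) is not None


def labels_to_regenerate(labels, force_all=False):
    """Select the labels that would be re-created.

    force_all=True mimics -f (every label is replaced);
    force_all=False mimics -fpapers (only Papers default labels are replaced).
    """
    return [label for label in labels if force_all or is_papers_default(label)]
```

With this sketch, a user-defined label such as `myKey` is left alone unless `force_all` is set, which mirrors the intent of preserving user-defined labels.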
Update (June 28th 2010)
- The predefined substitutions of common symbols and accented characters were made optional (-defsubst) so that users are able to define *all* substitutions in the configuration file. It is now possible, for example, to keep umlauts if UTF-8 is supported throughout the workflow.
- The use of fields without any delimiters (that is braces or double quotes) has been implemented. This facilitates the use of BibTeX macros/strings, which must not be enclosed in any delimiters. The delimiter type conversion was also made more robust to avoid creating invalid entries when quotation marks are present in the field's value.
- The feature to ignore BibTeX fields (-s option) was greatly extended. Apart from the general list that works on all items, there can now be other lists to configure reference types individually. It is now possible to configure, for example, a separate list for @misc items or slim down each BibTeX item to its mandatory fields only. The way these lists are defined is also way more comfortable than before.
- A reference type correction for @inproceedings was added. Additionally, the default type corrections can now be redefined and new ones freely defined in the configuration file.
Purpose
bibtexformat is a Perl script for the handling of BibTeX libraries. The required libraries to use it are contained in the tarball and are installed by an installation script.
If you use LaTeX you have to define a cite key for each paper you want to cite in your document (you cite using \cite{cite_key}). Many popular library managers (e.g. Endnote, Reference Manager or Papers) do not provide the possibility to create the cite keys automatically following certain rules. If you have the luck of using a Mac, the software Papers is more than highly recommended. It surpasses Endnote and others in more than just a couple of ways, but, as of v1.9.3, also does not offer user-defined cite key generation.
If you have or want to use one of the mentioned reference managers, you either have to create the cite keys manually (prone to mistakes) or import the exported BibTeX library into another program like JabRef or BibDesk to create the keys automatically. Every time you update your library in the original reference manager, this has to be repeated.
The script bibtexformat reads in the exported BibTeX library, and performs (among many others) the following actions:
- The cite keys are created and added using either the Author:Year or the Author:Year:FirstPage style. The latter makes unique keys very likely, and duplicate entries are detected in either case.
- Several format changes are carried out to make the library look nicer:
- The equal signs of BibTeX fields are aligned.
- A defined number of empty lines is added between items.
- The library can be sorted alphabetically according to the cite keys.
- Multiple line entries can be contracted into a single line.
- Long lines can be wrapped and given a defined indentation.
- BibTeX keywords (types and fields) can be changed to upper- or lowercase.
- Unwanted fields can be removed (e.g. abstract, notes, keywords, doi, and everything that is only needed in the reference software but not in the BibTeX library). This can also be done selectively, for example, the note field can be kept for @misc entries but deleted in all others.
- Regular expressions can be used on specific fields to, for example, turn relative path names of files into absolute paths and vice versa.
- ASCII garbage from symbols added in the reference manager is replaced with the correct BibTeX symbols (most German, Danish, French and Spanish special characters and all uppercase and lowercase Greek letters). Altogether, more than 100 symbols are replaced. This works only on Mac OS, though, since Windows uses its "Symbol" True Type font and this information is lost during the export.
- Journal titles can be added using Endnote term lists (available for virtually every subject and obtainable from the web). Multiple files as well as two different abbreviations are supported and various warnings (unknown title, no abbreviation given) displayed.
- Several BibTeX integrity checks can be performed:
- Check for author names with multiple parts (e.g. John van Doe Jr.).
- Check for multiple cite keys.
- Check for missing mandatory fields in each BibTeX type.
- Reference types can be replaced (including renaming of the fields). For example, if the term Thesis or Dissertation is found as journal name, the BibTeX item @article is changed to @phdthesis, along with the respective fields. This is a feature needed for exports from Papers before v2.0, since the exported reference types are incorrect.
- External files (linked via local-url and file) can be checked for existence.
- The list of all used authors, journals and reference types can be printed.
- The BibTeX field delimiters can be exchanged (either curly braces or double quotes).
- The letter case of the titles can be protected by enclosing them in curly braces.
- Page ranges can be expanded (e.g., 123–8 to 123–128).
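The page range expansion in the last item can be sketched in a few lines of Python. This is only an illustration of the idea, not the script's Perl implementation; the function name and the accepted dash characters are assumptions.

```python
import re

# A page range: start digits, one or more hyphens or en dashes, end digits.
RANGE = re.compile(r"^(\d+)\s*([-\u2013]+)\s*(\d+)$")


def expand_page_range(pages: str) -> str:
    """Expand an abbreviated end page, e.g. 123-8 becomes 123-128.

    Values that do not look like a numeric range are returned unchanged.
    The sketch does not check that the end page is larger than the start page.
    """
    match = RANGE.match(pages.strip())
    if not match:
        return pages
    start, dash, end = match.groups()
    if len(end) < len(start):
        # Borrow the missing leading digits from the start page.
        end = start[: len(start) - len(end)] + end
    return f"{start}{dash}{end}"
```

Ranges that are already complete, and non-numeric values such as roman numerals, pass through untouched in this sketch.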
The benefit of using bibtexformat instead of JabRef or BibDesk is that running the script takes about a second and settings can be changed to your liking with a few changes in the configuration file or the runscript, if necessary. It already replaces several expressions often used in computational and organic chemistry with the correct LaTeX formulas and this list can be easily expanded.
Usage Output
Please refer to the rather extensive manual for more details.
bibtexformat - automated substitutions and format changes of BibTeX library files
$Revision: 4895 $ $Date: 2011-02-04 19:05:55 +0100 (Fri, 04 Feb 2011) $

Generates and adds the labels (cite keys) to a BibTeX library file, abbreviates journal titles, filters out unneeded items, performs string replacements and checks author names for format errors that may lead to incorrect citations.

Usage: bibtexformat infile(s) [options] [-o outfile]

   -o           Define an output file (recommended). A leading dot defines an extension squeezed between the original extension and the filename (e.g. -o .new => Library.new.bib, as shortcut and needed for multiple file processing)
   -s           short library, leaves out abstract, keywords, etc. (see configuration file)
   -labels      create the labels (cite keys) for items without one
   -pn          use the first page number to create unambiguous labels of the type Author:Year:Page
   -fy          use the full year instead of only the last two digits (mind that it may not always be given)
   -sep         define the separator for the labels (default is ":")
   -f           force the generation of labels, even if already defined (this overwrites all existing labels)
   -fpapers     force the generation of labels for items with a Papers default label (Author:2000p1234) while keeping all other preexisting ones
   -rangefix    expand page numbers, e.g. 723-7 to 723-727
   -protitle    protect the case of the title by enclosing it with double braces
   -typecheck   check all items for mandatory fields of the respective BibTeX type
   -typereset   change all entry types to @article (for an export by Papers)
   -typefix     change BibTeX types depending on certain trigger words (requires configuration in /Volumes/Home/benb/bin/bibtexformat/configuration)
   -aopfix      correct ahead-of-print publications without page numbers by using the DOI instead (requires configuration in /Volumes/Home/benb/bin/bibtexformat/configuration)
   -autcheck    check the authors for correct division into first, last, von and Jr part
   -autfix      perform the user-defined corrections of author names
   -autmax      define a maximum number of authors before the list is shortened to et al.
   -autlist     print the number and a list of all authors
   -joulist     print the number and a list of all journals
   -typelist    print a list of all found BibTeX types
   -defsubst    perform the default substitutions in the titles
   -subst       perform user-defined substitutions in the titles
   -fieldregex  uses a regular expression on particular fields:
                -fieldregex fieldname "from" "to" fieldname "from" "to" [...]
   -fileregex   uses a number of regular expressions on each file in the 'file' fields:
                -fileregex "from1" "to1" "from2" "to2" [...]
                "^" matches the beginning of the filenames
                ""  to replace something with an empty string
   -sort        sort the items alphabetically according to their BibTeX label
   -nl          newlines between BibTeX items, default is 2
   -quotes      convert the field delimiters to quotes
   -braces      convert the field delimiters to braces
   -format      re-format the library to improve readability
   -lb          leading blanks before a field descriptor, default is 3
   -ep          position of equal signs (including leading blanks), default is 15
   -lc          format field keywords ("author") and reference types ("@book") lowercase
   -uc          format field keywords ("author") and reference types ("@book") uppercase
   -combine     combine multiline entries in one line
   -wrap 80     wraps the line at, for example, column 80 (indentation is considered)
   -abb         abbreviate the journal titles, one or more files containing the abbreviations can be given and also defined by default in the script
   -abb1        use the first abbreviation given in the abbreviation files (default)
   -abb2        use the second abbreviation given in the abbreviation files (usually without periods)
   -full        use the full journal title and replace all abbreviated titles with it
   -local2file  convert all file links from 'local-url' to 'file' entries (e.g. JabRef)
   -file2local  convert all file links from 'file' to 'local-url' entries (e.g. Papers)
   -filecheck   check the existence of all referenced files (file =)
   -filedir     base directory of the files for -filecheck
   -conf        read in a configuration file different from the one defined in the source
   -log         write all output to .log instead of STDOUT/STDERR.

Important information for the label generation using -labels:

If -pn is omitted, then the labels are generated using the first author, a colon, the year (two digits) and a lowercase letter to avoid duplicates, e.g.

   Author:99
   Author:03a
   Author:03b

If :03 exists and a second match is found, then the first one is renamed :03a and the second becomes :03b.

In order to always have the labels of an Endnote library assigned in the same sequence, even if new references are added and the library is exported again, it is IMPORTANT that the Endnote library is sorted according to the record numbers! Generally, using the page number is strongly recommended (-pn option).
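The label scheme described for the case without -pn can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the script's actual (Perl) code: the function name, the input format (a list of author/year pairs in library order) and the handling of the separator and full-year settings are all hypothetical.

```python
from collections import defaultdict
import string


def make_labels(entries, sep=":", full_year=False):
    """Build labels of the form Author:YY, adding a, b, c, ... to duplicates.

    entries: list of (first_author_last_name, year) tuples in library order.
    Suffixes are assigned in input order; more than 26 duplicates of one
    base label are not handled in this sketch.
    """
    bases = []
    for author, year in entries:
        year_part = str(year) if full_year else str(year)[-2:]
        bases.append(f"{author}{sep}{year_part}")

    # Count how often each base label occurs so that only real
    # duplicates receive a letter suffix.
    counts = defaultdict(int)
    for base in bases:
        counts[base] += 1

    seen = defaultdict(int)
    labels = []
    for base in bases:
        if counts[base] == 1:
            labels.append(base)
        else:
            labels.append(base + string.ascii_lowercase[seen[base]])
            seen[base] += 1
    return labels
```

Because the suffix letters follow the input order, the sketch also shows why a stable sort order of the exported library matters: reordering the entries would swap the a and b suffixes.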