bibtexformat
Download bibtexformat.zip (source and documentation)
Download bibtexformat.pdf (documentation only)
For support, feature requests and bug reports please send an email to
.
A minimal library file with the affected references is usually very helpful.
Update (June 28th 2010)
- The predefined substitutions of common symbols and accented characters were made optional so that users are able to define *all* substitutions in the configuration file. It is now possible, for example, to keep umlauts if UTF-8 is supported throughout the workflow.
- The use of fields without any delimiters (that is braces or double quotes) has been implemented. This facilitates the use of BibTeX macros/strings, which must not be enclosed in any delimiters. The delimiter type conversion was also made more robust to avoid creating invalid entries when quotation marks are present in the field's value.
- The feature to ignore BibTeX fields (-s option) was greatly extended. Apart from the general list that works on all items, there can now be other lists to configure reference types individually. It is now possible to configure, for example, a separate list for @misc items or to slim down each BibTeX item to its mandatory fields only. The way these lists are defined is also much more convenient than before.
- A reference type correction for @inproceedings was added. Additionally, the default type corrections can now be redefined and new ones freely defined in the configuration file.
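The combination of a general ignore list and per-type ignore lists can be pictured with a small Python sketch. This is only an illustration, not the script's Perl code: the names `IGNORE_ALL`, `IGNORE_BY_TYPE` and `strip_fields` as well as the listed fields are hypothetical, and the real lists are defined in the configuration file in a syntax not shown here.

```python
# Hypothetical ignore lists (assumption: the real ones live in the
# configuration file and may look quite different).
IGNORE_ALL = {"abstract", "keywords", "doi"}   # applies to every item
IGNORE_BY_TYPE = {"misc": {"note", "url"}}     # extra list for @misc only


def strip_fields(entry_type, fields):
    """Return a copy of `fields` without the ignored field names."""
    ignored = IGNORE_ALL | IGNORE_BY_TYPE.get(entry_type.lower(), set())
    return {name: value for name, value in fields.items()
            if name.lower() not in ignored}
```

With these example lists, a `note` field would be dropped from a @misc item but kept in an @article item, while `doi` would be dropped from both.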
Update (March 18th 2010)
A really stupid bug concerning the cite key generation was fixed. If page numbers are used (-pn), the cite key format is Author:Year:Page. If a page range is given, the start page is extracted and used. Due to the bug, however, the page was ignored if only the start page was given in the first place. This went unnoticed for several months because most citation databases provide the proper citation with the range (I had one case out of 450).
The correction of this bug causes cite keys of papers given without a page range to change. I apologize for any inconvenience.
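The corrected behavior can be sketched in a few lines of Python. This is an illustration under assumptions, not the script's Perl implementation: `start_page` and `cite_key` are hypothetical names, and the start page is simply taken to be the first run of digits in the pages field.

```python
import re


def start_page(pages):
    """Return the first run of digits in a pages field, or None."""
    match = re.search(r"\d+", pages)
    return match.group(0) if match else None


def cite_key(author, year, pages, sep=":"):
    """Build Author<sep>Year<sep>Page; omit the page if none is found."""
    parts = [author, year]
    page = start_page(pages)
    if page is not None:
        parts.append(page)
    return sep.join(parts)
```

In this sketch, a page range such as `723-7` and a single start page such as `723` both yield the same key, which is the behavior described above after the fix.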
Update (January 12th 2010)
- -filecheck option added to check linked files for existence.
- -filedir option added to define the folder linked files are contained in.
- The conversion between local-url and file fields was fixed.
Important: If you are already a user of the script and update to the newest version, please have a look at page 3 in the manual as some command line options and settings in the configuration file have changed.
Purpose
bibtexformat is a Perl script for the handling of BibTeX libraries. The libraries required to use it are contained in the tarball and are installed by an installation script.
If you use LaTeX, you have to define a cite key for each paper you want to cite in your document (you cite using \cite{cite_key}). Many popular library managers (e.g. Endnote, Reference Manager or Papers) cannot create the cite keys automatically following certain rules. If you are lucky enough to use a Mac, the software Papers is highly recommended. It surpasses Endnote and others in more than just a couple of ways but, as of v1.9.3, does not offer user-defined cite key generation either.
If you have or want to use one of the mentioned reference managers, you either have to create the cite keys manually (prone to mistakes) or import the exported BibTeX library into another program like JabRef or BibDesk to create the keys automatically. Every time you update your library in the original reference manager, this has to be repeated.
The script bibtexformat reads in the exported BibTeX library, and performs (among many others) the following actions:
- The cite keys are created and added, either in Author:Year style (with a lowercase letter appended to avoid duplicates) or in Author:Year:FirstPage style. The latter ensures unique keys quite safely; duplicate entries are detected, of course.
- Several format changes are carried out to make the library look nicer:
  - The equal signs of BibTeX fields are aligned.
  - A defined number of empty lines is added between items.
  - The library can be sorted alphabetically according to the cite keys.
  - Multiple line entries can be contracted into a single line.
  - Long lines can be wrapped with a defined indentation.
  - BibTeX keywords (types and fields) can be changed to upper- or lowercase.
- Unwanted fields can be removed (e.g. abstract, notes, keywords, doi, and everything that is only needed in the reference software but not in the BibTeX library).
- ASCII garbage from symbols added in the reference manager is replaced with the correct BibTeX symbols (most German, Danish, French and Spanish special characters and all uppercase and lowercase Greek letters). Altogether, more than 120 symbols are replaced. This works only on Mac OS, though, since Windows uses its "Symbol" True Type font and this information is lost during the export.
- Journal titles can be added using Endnote term lists (available for virtually every subject and obtainable from the web). Multiple files as well as two different abbreviations are supported, and various warnings (unknown title, no abbreviation given) are displayed.
- Several BibTeX integrity checks can be performed:
  - Check for author names with multiple parts (e.g. John van Doe Jr.).
  - Check for multiple cite keys.
  - Check for missing mandatory fields in each BibTeX type.
- Reference types can be replaced (including renaming of the fields). For example, if the term Thesis or Dissertation is found as journal name, the BibTeX item @article is changed to @phdthesis, along with the respective fields. This is a feature needed for exports from Papers before v2.0, since the exported reference types are incorrect.
- External files (linked via local-url and file) can be checked for existence.
- The list of all used authors, journals and reference types can be printed.
- The BibTeX field delimiters can be exchanged (either curly braces or double quotes).
- The letter case of the titles can be protected by enclosing them in curly braces.
- Page ranges can be expanded (e.g., 123–8 to 123–128).
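As an illustration of the page range expansion in the last item, here is a minimal Python sketch. It is not the script's Perl code. The function name `expand_page_range` is hypothetical, and the sketch assumes purely numeric pages separated by a hyphen, a double hyphen or an en dash; anything else is returned unchanged.

```python
import re

# Start page, separator (-, -- or en dash, optional blanks), end page.
_RANGE = re.compile(r"^(\d+)(\s*(?:--?|\u2013)\s*)(\d+)$")


def expand_page_range(pages):
    """Expand an abbreviated end page, keeping the original separator."""
    match = _RANGE.match(pages.strip())
    if match is None:
        return pages  # not a simple numeric range, leave untouched
    first, dash, last = match.groups()
    if len(last) < len(first):
        # Borrow the missing leading digits from the start page.
        last = first[: len(first) - len(last)] + last
    return first + dash + last
```

The end page is completed by borrowing leading digits from the start page, so ranges that are already complete pass through unchanged.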
The benefit of using the script instead of JabRef or BibDesk is that running the script takes about a second and settings can be changed to your liking with a few changes in the configuration file or the runscript, if necessary. It already replaces several expressions often used in computational and organic chemistry with the correct LaTeX formulas and this list can be easily expanded.
Usage Output
Please refer to the rather extensive manual for more details.
bibtexformat - automated substitutions and format changes of BibTeX library files
$Revision: 4863 $
$Date: 2010-01-09 19:07:13 +0100 (Sat, 09 Jan 2010) $
Generates and adds the labels (cite keys) to a BibTeX library file, abbreviates journal
titles, filters out unneeded items, performs string replacements and checks author names
for format errors that may lead to incorrect citations.
Usage: bibtexformat infile(s) [options] [-o outfile]
  -o          Define an output file (recommended).
              A leading dot defines an extension squeezed between the original extension
              and the filename (e.g. -o .new => Library.new.bib, as shortcut and needed
              for multiple file processing)
  -s          short library, leaves out abstract, keywords, etc. (see configuration file)
  -labels     create the labels (cite keys)
  -pn         use the first page number to create unambiguous labels of the type
              Author:Year:Page
  -fy         use the full year instead of only the last two digits (mind that it may
              not always be given)
  -sep        define the separator for the labels (default is ":")
  -f          force the generation of labels, even if already defined
              (this overwrites existing labels)
  -rangefix   expand page numbers, e.g. 723-7 to 723-727
  -typereset  change all entry types to @article (for a export by Papers)
  -typefix    change BibTeX types depending on certain keywords
  -protitle   protect the case of the title by enclosing it with double braces
  -autcheck   check the authors for correct division into first, last, von and Jr part
  -autfix     perform the user-defined corrections of author names
  -autmax     define a maximum number of authors before the list is shortened to et al.
  -autlist    print the number and a list of all authors
  -joulist    print the number and a list of all journals
  -typelist   print a list of all found BibTeX types
  -subst      perform user-defined substitutions in the titles
  -sort       sort the items alphabetically according to their BibTeX label
  -nl         newlines between BibTeX items, default is 2
  -quotes     convert the field delimiters to quotes
  -braces     convert the field delimiters to braces
  -format     re-format the library to improve readability
  -lb         leading blanks before a field descriptor, default is 3
  -ep         position of equal signs (including leading blanks), default is 15
  -lc         format field keywords ("author") and reference types ("@book") lowercase
  -uc         format field keywords ("author") and reference types ("@book") uppercase
  -combine    combine multiline entries in one line
  -wrap 80    wraps the line at, for example, column 80 (indentation is considered)
  -abb        abbreviate the journal titles, one or more files containing the abbreviations
              can be given and also defined by default in the script
  -abb1       use the first abbreviation given in the abbreviation files (default)
  -abb2       use the second abbreviation given in the abbreviation files
              (usually without periods)
  -full       use the full journal title and replace all abbreviated titles with it
  -local2file convert all file links from 'local-url' to 'file' entries (e.g. JabRef)
  -file2local convert all file links from 'file' to 'local-url' entries (e.g. Papers)
  -filecheck  check the existence of all referenced files (file =)
  -filedir    base directory of the files for -filecheck
  -conf       read in a configuration file different from the one defined in the source
  -log        write all output to .log instead of STDOUT/STDERR.
If -pn is omitted, then the labels are generated using the first author, a colon,
the year (two digits) and a lowercase letter to avoid duplicates, e.g.
Author:99
Author:03a
Author:03b
If :03 exists and a second match is found, then the first one is renamed :03a
and the second becomes :03b.
In order to always have the labels of an Endnote library assigned in the same
sequence, even if new references are added and the library is exported again,
it is IMPORTANT that the Endnote library is sorted according to the RECORD NUMBERS!
Generally, using the page number is strongly recommended.
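The label scheme without -pn, as described in the usage text above, can be sketched in Python as follows. This is a simplified illustration, not the script's Perl code: `assign_labels` is a hypothetical name, the input is assumed to be a list of (first author, four-digit year) pairs in library order, and more than 26 duplicates of one label are not handled.

```python
from collections import Counter
from string import ascii_lowercase


def assign_labels(entries, sep=":"):
    """Return one label per (author, year) pair, in input order.

    Unique labels stay as Author:YY. Duplicates get a, b, c, ...
    appended in the order in which they appear in the library.
    """
    bases = [f"{author}{sep}{year[-2:]}" for author, year in entries]
    totals = Counter(bases)  # how often each base label occurs
    seen = Counter()         # how many duplicates were labelled so far
    labels = []
    for base in bases:
        if totals[base] == 1:
            labels.append(base)
        else:
            labels.append(base + ascii_lowercase[seen[base]])
            seen[base] += 1
    return labels
```

Because the letters follow the input order, exporting the same library in a different order would assign them differently in this sketch, which is the reason for the sorting advice above.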