Derive a file name according to old file name cues and/or PDF file content

## Time-stamp: <2016-03-06 19:41:02 vk>
## -*- coding: utf-8 -*-
## This file is best viewed with GNU Emacs Org-mode: http://orgmode.org/

guessfilename.py

This Python script tries to come up with a new file name for each file passed as a command-line argument.

It uses two methods. First, the current file name is analyzed, and any ISO date/time-stamp and filetags it contains are re-used. Second, if parsing the file name does not yield a new file name, the content of the file is analyzed. The following file types are supported so far:

  • PDF files
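The first step, re-using a date-stamp and tags from the old file name, can be sketched roughly as follows. This is only an illustration, not the script's actual code: the function name `parse_file_name`, the time-stamp pattern, and the ` -- ` separator in front of the tags are assumptions made for this example.

```python
import re
from pathlib import Path

# Assumption: the stamp sits at the start of the name, as YYYY-MM-DD with an
# optional time part such as T11.56 or T11.56.29.
ISO_STAMP = re.compile(r"^(\d{4}-\d{2}-\d{2})(?:T(\d{2}\.\d{2}(?:\.\d{2})?))?")

# Assumption: tags follow the description after " -- ", separated by spaces.
TAG_SEPARATOR = " -- "


def parse_file_name(file_name):
    """Split a file name into stamp, free-text description, and tags."""
    stem = Path(file_name).stem  # drop the extension
    description, separator, tag_part = stem.partition(TAG_SEPARATOR)
    tags = tag_part.split() if separator else []
    match = ISO_STAMP.match(description)
    if match:
        stamp = match.group(0)
        description = description[match.end():]
    else:
        stamp = None
    return {"stamp": stamp, "description": description.strip(), "tags": tags}


if __name__ == "__main__":
    print(parse_file_name("2016-03-05T11.56 gas bill -- scan bill.pdf"))
    print(parse_file_name("scan0001.pdf"))
```

If no stamp or cue is found this way, a tool following the approach described above would fall back to looking at the file content.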

The script accepts an arbitrary number of files (see your shell for possible length limitations).

Why

I scan almost all of my paper mail. Many of those documents are sent to me regularly, for example bills or insurance information.

Naming those files manually is tedious and tends to produce many variants of the name for the same document type. Being too lazy for that, I came up with a method to derive file names from either the old file name (cues I enter without knowing the exact target file name) or the file content.

Analyzing the content enables this script to recognize bills via customer numbers or phone numbers, amounts to pay, and so on.
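To give an idea of how such content-based recognition might look, here is a minimal sketch. It assumes the text has already been extracted from the PDF. The vendor "Example Energy", the customer number, the amount pattern, and the helper names `contains_all_of` and `guess_bill_name` are all made up for this example and are not taken from the script.

```python
import re

# Assumption: amounts are written like "84,20 EUR" or "1.284,20 €".
AMOUNT = re.compile(r"(\d{1,3}(?:\.\d{3})*,\d{2})\s*(?:EUR|€)")


def contains_all_of(text, cues):
    """Return True if every cue occurs in the text, ignoring case."""
    lowered = text.lower()
    return all(cue.lower() in lowered for cue in cues)


def guess_bill_name(text, date_stamp):
    """Return a file name for a recognized bill, or None if nothing matches."""
    # Hypothetical vendor and customer number that identify this bill type.
    if not contains_all_of(text, ["Example Energy", "Customer number 12345"]):
        return None
    match = AMOUNT.search(text)
    amount = f" {match.group(1)}EUR" if match else ""
    return f"{date_stamp} Example Energy bill{amount} -- bill.pdf"


if __name__ == "__main__":
    sample = "Example Energy\nCustomer number 12345\nAmount due: 84,20 EUR"
    print(guess_bill_name(sample, "2016-03-07"))
```

Requiring several cues at once (vendor name plus customer number) keeps a single common word from triggering a wrong match.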

Usage

guessfilename.py a_file_name.txt

… FIXXME

For a complete list of parameters, please try:

guessfilename.py --help
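For readers curious how a script can accept an arbitrary number of files and offer --help, the following is a generic sketch using Python's standard argparse module. It is not the actual option handling of guessfilename.py; the function `build_parser` and the argument name `files` are assumptions for illustration, and the script's real parameters are only those listed by --help.

```python
import argparse


def build_parser():
    """Build a parser for one or more file names; argparse adds --help itself."""
    parser = argparse.ArgumentParser(
        description="Illustrative skeleton: derive new names for the given files."
    )
    # nargs="+" requires at least one file and collects all of them in a list.
    parser.add_argument("files", nargs="+", help="one or more files to process")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    for file_name in args.files:
        print(f"would analyze: {file_name}")
```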

Related tools and workflows

This tool is part of a tool-set which I use to manage my digital files such as photographs. My work-flows are described in this blog posting you might like to read.

In short:

For tagging, please refer to filetags and its documentation.

See date2name for easily adding ISO time-stamps or date-stamps to files.

For easily naming and tagging files within file browsers that allow integration of external tools, see appendfilename (once more) and filetags.

Moving to the archive folders is done using move2archive.

Having tagged photographs gives you many advantages. For example, I automatically choose my desktop background image according to the current season.

Files containing an ISO time/date-stamp get indexed by the filename-module of Memacs.

Contribute!

I am looking for your ideas!

If you want to contribute to this cool project, please fork and contribute!