Welcome to FolderAnalyse's documentation! ========================================= .. toctree:: :maxdepth: 2 :caption: Contents: source/modules.rst Installation ============ FolderAnalyse requires Python 3.6 or above, and has been tested on Linux and MacOS. First, to install Python, visit Python.org, or try the Anaconda.org distribution. Once Python is installed, simply install the package using the Python package manager pip, by running on the command line: .. code-block:: bash pip install FolderAnalyse Alternatively, you can install the development version from GitHub via: .. code-block:: bash pip install git+https://github.com/rpep/FolderAnalyse Command Line Use ================ Here we give some information about the module FolderAnalyse. To generate statistics about a particular file: .. code-block:: bash echo "The quick brown fox jumped over the lazy dog" > test.txt FolderAnalyse test.txt You should see some output like: .. code-block:: bash File "test.txt" Top 10 Word Frequencies --------------------------------------- 1. The, 1 2. quick, 1 3. brown, 1 4. fox, 1 5. jumped, 1 6. over, 1 7. the, 1 8. lazy, 1 9. dog, 1 To generate statistics about all files in a folder: .. code-block:: bash FolderAnalyse /path/to/a/folder To generate statistics about all "\*.md" files in a folder: .. code-block:: bash FolderAnalyse /path/to/a/folder -t ".md" To save the outputted text as a report: .. code-block:: bash FolderAnalyse /path/to/a/folder -s report.txt The tests for the project can be run from the command line with: .. code-block:: bash FolderAnalyse . -r For the test cases I made use of out-of-copyright Project Gutenberg books as useful reference cases. These are included in the tests/example_docs folder. API === In general FolderAnalyse is designed to be used from the command line, but here I'll show how you can use the functions in your own projects. The bulk of the interesting code is in :mod:`FolderAnalyse.process`, in the two functions :func:`~FolderAnalyse.process.process_file` and :func:`~FolderAnalyse.process.process_dir`. To process a file and get the frequency dictionary, simply: .. code-block:: python >>> import FolderAnalyse.process as p >>> f1 = open('test1.txt', 'w') >>> f1.write("The quick brown fox jumped over the lazy dog") >>> f1.close() >>> stats_text, frequency_dict, top_freqs = p.process_file('test.txt', N=5, case_sensitive=False) >>> print(top_freqs['the']) 2 If we create another file, we can use directory processing: .. code-block:: python >>> f2 = open('test2.txt', 'w') >>> f2.write("Writing words to the second file") >>> f2.close() # See the API documentation for more details: >>> text, dics, top_dic, cdic, top_cdic = sp.process_dir('.') >>> print(top_cdic['the']) 3 If the word counts are all that is required, this can be handled just using the function :func:`~FolderAnalyse.fileparser.parse`. .. code-block:: python >>> import FolderAnalyse.fileparser as fp >>> print(fp.parse('test2.txt', sort=True)) {'writing': 1, 'words': 1, 'to': 1, 'the': 1, 'second': 1, 'file': 1} The tests for the project can be run directly from the Python interpreter with: .. code-block:: python >>> import FolderAnalyse >>> FolderAnalyse.runtests() Indices and tables ================== * :ref:`genindex` * :ref:`modindex` * :ref:`search`