Guide
Ananse requires Python 3.7 or higher.
Setup
Using pip
pip install ananse
Directly from the repository
git clone https://github.com/baasare/ananse.git
cd ananse
python setup.py install
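After either installation route, a quick sanity check can confirm that the interpreter meets the version requirement and that the package is importable. The following is a minimal sketch using only the standard library; the helper name check_environment is hypothetical and not part of Ananse.

```python
import sys
from importlib import util


def check_environment(package="ananse", min_version=(3, 7)):
    """Return (python_ok, package_found) for the current interpreter.

    check_environment is a hypothetical helper for illustration only.
    """
    # Compare the running interpreter against the documented minimum version
    python_ok = sys.version_info >= min_version
    # find_spec returns None when a top-level package cannot be located
    package_found = util.find_spec(package) is not None
    return python_ok, package_found


python_ok, package_found = check_environment()
print("Python 3.7 or higher:", python_ok)
print("ananse importable:", package_found)
```

If the second line reports False, the package was probably installed into a different environment than the one running the script.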
Quick start
Writing your own script
from ananse import Ananse
min_len = 1  # minimum keyword length
max_len = 4  # maximum keyword length
# Create an instance of the package
test_run = Ananse()
# Import your naive search results from the current working directory
imports = test_run.import_naive_results(path="./")
# Columns to deduplicate imported search results
columns = ['title', 'abstract']
# De-duplicate the imported search results
data = test_run.deduplicate_dataframe(imports, columns)
# Extract keywords from article title and abstract as well as author and database tagged keywords
all_terms = test_run.extract_terms(data, min_len=min_len, max_len=max_len)
# Create Document-Term Matrix, with columns as terms and rows as articles
dtm, term_column = test_run.create_dtm(data.text, keywords=all_terms, min_len=min_len, max_len=max_len)
# Create co-occurrence network using Document-Term Matrix
graph_network = test_run.create_network(dtm, all_terms)
# Plot histogram and node strength of the network
test_run.plot_degree_histogram(graph_network)
test_run.plot_degree_distribution(graph_network)
# Determine cutoff for the relevant keywords
cutoff_strengths = test_run.find_cutoff(graph_network, "spline", "degree", degrees=3, knot_num=1, percent=0.879956,
                                        diagnostics=True)
# Get suggested keywords and save to a csv file
suggested_keywords = test_run.get_keywords(graph_network, "degree", cutoff_strengths, save_keywords=True)
# Print suggested keywords
for word in suggested_keywords:
    print(word)
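To make the middle of the pipeline more concrete, the sketch below illustrates what a document-term matrix and a keyword co-occurrence network are, using only the standard library. It is a conceptual illustration, not Ananse's implementation: the function names build_dtm and cooccurrence_strength, the sample documents, and the naive substring matching are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def build_dtm(documents, terms):
    """Return one row per document: 1 if the term occurs in it, else 0.

    Hypothetical helper; uses naive case-insensitive substring matching.
    """
    return [[1 if term in doc.lower() else 0 for term in terms] for doc in documents]


def cooccurrence_strength(dtm, terms):
    """Return edge weights and per-term node strength.

    An edge weight counts the documents in which two terms appear together;
    a term's node strength is the sum of the weights of its edges.
    """
    weights = Counter()
    for row in dtm:
        present = [terms[i] for i, flag in enumerate(row) if flag]
        for a, b in combinations(present, 2):
            weights[(a, b)] += 1
    strength = Counter({term: 0 for term in terms})
    for (a, b), weight in weights.items():
        strength[a] += weight
        strength[b] += weight
    return weights, strength


# Made-up sample data, for illustration only
documents = [
    "Fire regime shifts alter bird habitat",
    "Bird habitat loss in wetlands",
    "Habitat models for small mammals",
]
terms = ["fire", "bird habitat", "habitat"]

dtm = build_dtm(documents, terms)
weights, strength = cooccurrence_strength(dtm, terms)
for term, value in strength.most_common():
    print(term, value)
```

Terms that co-occur with many others accumulate a higher strength, which is the kind of quantity a cutoff can then be applied to when selecting suggested keywords.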
Using the Ananse test script
python tests/ananse_test