Andreas Kopmann 2162fb6fde Update of group members | 3 years ago | |
---|---|---|
doc | 7 years ago | |
etc | 3 years ago | |
log | 3 years ago | |
.gitignore | 6 years ago | |
README.md | 5 years ago | |
ak_scopus.py | 5 years ago | |
ak_wordpress.py | 5 years ago | |
config.py.sample | 6 years ago | |
create_scopus.sql | 7 years ago | |
rm-scopusid.py | 5 years ago | |
scopus-update-database.py | 7 years ago | |
scopus_get_publications.py | 6 years ago | |
test-citations.py | 6 years ago | |
test-citations2.py | 6 years ago | |
test-scopus.py | 5 years ago | |
test-scopus2.py | 6 years ago | |
test-wp.py | 5 years ago | |
test-wp2.py | 6 years ago | |
update.sh | 6 years ago |
Ak, 23.5.2017
Get information on publications of work groups from Elsevier's Scopus database for use in websites. For each publication, a post in a Wordpress CMS is created. Citations are mapped to Wordpress comments. The get-publication script is intended to run on a regular basis (e.g. by cron).
Note: All Scopus scripts run only with valid access to the Scopus database (e.g. from the KIT LAN). The Scopus service is not publicly available.
Todo:
Version 1.3, 23.5.17 (ak):
Version 1.2, 24.4.17 (ak):
Version 1.1, 12.4.17 (ak):
Version 1.0, 8.3.17 (ak):
readme.md This file
config.py Site-dependent configuration file (in GIT as config.py.sample)
my_scopus.py List of scopus author ids
ak_scopus.py Functions to access scopus
ak_wordpress.py Functions to create Wordpress posts + comments
scopus-get-publications.py Script to query Scopus
scopus-update-database.py Synchronize database and available Wordpress posts
test-scopus.py Application with some functions to get publication entries
Prints a list with some formatting
test-scopus2.py Example from one of the websites, only one query
test-wp.py Test script for access to the wordpress API
test-wp2.py Test script for wordpress - only query, no modification
etc Configuration files of different installations
info Documentation, website, etc (not in GIT)
log Log files, named scopus-publications-<hostname>.log
Go to Scopus and retrieve the scopus author ids for the scientists in your group. Define the ids in etc/config-.py and group them.
Create a symbolic link
ln -s etc/config-<hostname>.py config.py
Select one or more author groups and define the list sc_workgroups in config.py. Check the definition of the database and the Wordpress installation.
Execute scopus-get-publications.py:
python -W ignore scopus-get-publications.py
Note: The -W ignore flag might be necessary if the INSERT IGNORE causes warnings.
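The -W ignore flag silences every warning of the run. A narrower alternative would be to suppress only the duplicate-key warnings inside the script. The sketch below illustrates the idea with the standard warnings module. DuplicateRowWarning and insert_ignore_stub are hypothetical stand-ins, because this README does not state which warning class the database driver raises.

```python
import warnings


class DuplicateRowWarning(Warning):
    """Hypothetical stand-in for the warning class of the database driver."""


def insert_ignore_stub():
    """Pretend to run an INSERT IGNORE that hits an existing row."""
    warnings.warn("Duplicate entry for key 'scopusid'", DuplicateRowWarning)
    return 0  # number of rows written


def run_quietly(func):
    """Call func() while ignoring only DuplicateRowWarning."""
    with warnings.catch_warnings():
        # The filter is active only inside this block; all other
        # warnings are still reported as usual.
        warnings.simplefilter("ignore", category=DuplicateRowWarning)
        return func()


if __name__ == "__main__":
    print(run_quietly(insert_ignore_stub))
```

With such a filter in place the script could be started without -W ignore, so unrelated warnings would remain visible.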
Example run:
ufo:~/scopus # python -W ignore scopus-get-publications.py
***********************************************
**** scopus-get-publications / 2017-03-27 *****
***********************************************
=== Update of publications for the author group: Computing
Total number of publications: 54
=== Update of publications for the author group: X-ray Imaging
Total number of publications: 39
=== Update of publications for the author group: Electronics
Total number of publications: 132
=== Update of publications for the author group: Morphology
Total number of publications: 21
=== Create posts for newly registered publication in scopus
Nothing new found
=== Update citatation of all publication in the database
Total number of publications is 281
=== Create comments for newly registered citations in scopus
Number of new citations is 0
Summary: (see also logfile /root/scopus/scopus-publications.log)
Date = 2017-03-27 21:28:36.002624
NPubs = 281
NNewPubs = 0
NCites = 4699
NNewCites = 0
Runtime = 0:00:11.496362
Todo:
For each site, the database, the access to Wordpress, and the author profiles need to be configured in config.py.
For the UFO webpage the configuration looks as displayed below. In the first part the access to the database is configured. The database is used as a cache to keep track of which publications are already available in Wordpress.
In the second block the access to the Wordpress server is given. The specified user (e.g. called scopus) needs to have editor permissions in order to submit new posts and to suggest keywords.
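As an illustration of what submitting a post through the XML-RPC interface can look like, here is a minimal sketch based on the python-wordpress-xmlrpc package that is installed further down. It is not the code of ak_wordpress.py. The helper names build_post_fields and submit_post are hypothetical, and so is the choice of post fields.

```python
def build_post_fields(title, abstract, category, keywords):
    """Collect the fields of one publication post in a plain dictionary."""
    return {
        "title": title,
        "content": abstract,
        # terms_names expects term names; how the real script maps the
        # category slug from config.py to a term is not shown in this README.
        "terms_names": {"category": [category], "post_tag": list(keywords)},
        "post_status": "publish",
    }


def submit_post(fields, api_url, user, password):
    """Send the post via XML-RPC and return the new Wordpress post id."""
    # Imported here so that build_post_fields() also works on machines
    # where python-wordpress-xmlrpc is not installed.
    from wordpress_xmlrpc import Client, WordPressPost
    from wordpress_xmlrpc.methods.posts import NewPost

    post = WordPressPost()
    post.title = fields["title"]
    post.content = fields["content"]
    post.terms_names = fields["terms_names"]
    post.post_status = fields["post_status"]

    client = Client(api_url, user, password)
    return client.call(NewPost(post))


if __name__ == "__main__":
    fields = build_post_fields(
        "Example title", "Example abstract", "computing", ["GPU"]
    )
    print(fields)
    # submit_post(fields, wp_api_url, wp_user, wp_password)  # needs a server
```

The call to submit_post is left commented out because it needs a reachable Wordpress server and a user with editor permissions.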
For the access to scopus a key is required.
The main part of the configuration is dedicated to author identification and grouping of authors. First, a variable is defined for each author. If an author is registered with more than one Scopus ID, these IDs should be added as well.
Finally, the variable sc_workgroups defines a named list of all groups of authors that should be considered. The names of the groups need to be defined in Wordpress as categories. The slug name of the category in Wordpress is used in the configuration file to identify the author group.
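How such a group definition could be turned into a Scopus search is sketched below. This is only an illustration, not the code of ak_scopus.py. The function build_group_query is hypothetical, and it is an assumption that sc_start means "publications from this year on". AU-ID and PUBYEAR are field codes of the Scopus advanced search.

```python
def build_group_query(author_ids, start_year):
    """Return a Scopus search string for all authors of one workgroup."""
    # One AU-ID(...) term per Scopus author id, combined with OR.
    authors = " OR ".join("AU-ID({})".format(a) for a in author_ids)
    # Assumption: start_year is inclusive, hence "> start_year - 1".
    return "({}) AND PUBYEAR > {}".format(authors, start_year - 1)


if __name__ == "__main__":
    # Reduced example in the shape of sc_workgroups from config.py.
    workgroups = [
        {"name": "computing", "authors": ["35313939900", "57193311016"]},
        {"name": "morphology", "authors": ["46761453500"]},
    ]
    for group in workgroups:
        print(group["name"], "->", build_group_query(group["authors"], 2010))
```

For the first group this yields `(AU-ID(35313939900) OR AU-ID(57193311016)) AND PUBYEAR > 2009`.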
Config.py:
""" Scopus script's configuration
*A Kopmann, 12.4.17*
Configuration for the active setup at ufo.kit.edu
"""
# Local publication database
db_host = 'localhost'
db_user = 'scopus'
db_pw = '$scopus$'
db_name = 'scopus'
# Access to Wordpress installation
wp_api_url = "https://ufo.kit.edu/dis/xmlrpc.php"
""" Access to the Wordpress installation """
wp_user = "scopus"
wp_password = "$scopus$"
# Reporting
log_file = "/root/scopus/log/scopus-publications-ufo-kit-edu.log"
""" Logfile name for reporting """
# Scopus query definition
MY_API_KEY = "14d431d052c2caf5e9c4b1ab7de7463d"
""" Scopus access key (Andreas Kopmann) """
DTS_API_KEY = "f2b35fe46478f22f3c14cf53f73d4f93"
# Scopus author IDs
# KIT, PDV
ak = "35313939900"
ak2 = "57193311016"
csa = "15076530600"
matthiasVogelgesang = "35303862100"
timoDritschler = "56473578500"
andreiShkarin = "56950893700"
nicholasTanJerome = "57200247965"
tillBergmann = "35308595100"
armenBeglarian = "55343303900"
petraRohr = "40561503300"
norbertKunka = "35276889200"
horstDemattio = "6506285395"
# KIT, EPS
micheleCaselle = "57194376511"
mc2 = "57194376512"
urosStevanovic = "55557712600"
lorenzoRota = "56473442500"
matthiasBalzer = "35519411500"
# KIT, IPE
marcWeber = "56654729000"
mw2 = "56603987800"
mw3 = "7404138824"
# KIT, IPS
tomyRolo = "56118820400"
tr2 = "35194644400"
tr3 = "35277157300"
tomasFarago = "56655045700"
alexyErshof = "56441809800"
romanShkarin = "56951331000"
tiloBaumbach = "7003270957"
thomasVandekamp = "46761453500"
danielHaenschke = "55532222200"
# TUD
michaelHeethoff = "55979397800"
sebastianSchmelzle = "34768986100"
# UHD, has been combined with another person in Munich !!!
philipLoesel = "57203423658"
# Others (e.g. for black list)
ashotChiligarian = "7004126133"
hansBluemer = "7006284555"
matthiasKleifegs = "6602072426"
# Definition of workgroups for automatic Scopus publication retrieval
sc_start = 2010
sc_citations = False
sc_keywords = True
sc_max_authors = 25
ufo_pdv = [ak, ak2, csa, matthiasVogelgesang, timoDritschler ]
ufo_eps = [matthiasBalzer, lorenzoRota, micheleCaselle, mc2 ]
ufo_ips = [tomyRolo, tr2, tr3, tomasFarago, danielHaenschke]
ufo_apps = [thomasVandekamp]
ufo_alg = [philipLoesel]
sc_workgroups = [
{'name':'computing','authors':ufo_pdv},
{'name':'electronics','authors':ufo_eps},
{'name':'x-ray-imaging','authors':ufo_ips},
{'name':'morphology','authors':ufo_apps},
{'name':'algorithms','authors':ufo_alg}
]
""" Definition of the workgroups
Each workgroup is defined by a list of Scopus ID's and the
name of the category to be used in Wordpress. The category
for a new workgroup has to be created in Wordpress before
adding publications
"""
Both tables keep the reference to the publications in Scopus and the Wordpress ids. With this information, reprocessing is possible (but not implemented yet).
Table publications:
MariaDB [scopus]> describe publications;
+--------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| scopusid | varchar(255) | YES | UNI | NULL | |
| wpid | int(11) | YES | | NULL | |
| citedbycount | int(11) | YES | | NULL | |
| citesloaded | int(11) | YES | | NULL | |
| categories | varchar(255) | YES | | NULL | |
| doi | varchar(255) | YES | | NULL | |
| title | varchar(255) | YES | | NULL | |
| abstract | text | YES | | NULL | |
| bibtex | text | YES | | NULL | |
| ts | datetime | YES | | NULL | |
| scopusdata | text | YES | | NULL | |
| eid | varchar(255) | YES | | NULL | |
+--------------+--------------+------+-----+---------+----------------+
Table citations:
MariaDB [scopus]> describe citations;
+--------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| scopusid | varchar(255) | YES | | NULL | |
| eid | varchar(255) | YES | | NULL | |
| wpid | int(11) | YES | MUL | NULL | |
| wpcommentid | int(11) | YES | | NULL | |
| citedbycount | int(11) | YES | | NULL | |
| citesloaded | int(11) | YES | | NULL | |
| categories | varchar(255) | YES | | NULL | |
| doi | varchar(255) | YES | | NULL | |
| scopusdata | text | YES | | NULL | |
| title | varchar(255) | YES | | NULL | |
| abstract | text | YES | | NULL | |
| bibtex | text | YES | | NULL | |
| ts | datetime | YES | | NULL | |
+--------------+--------------+------+-----+---------+----------------+
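The unique key on scopusid in the publications table is what lets the database act as a cache: inserting a known publication a second time is simply skipped. The sketch below demonstrates this behaviour. To stay self-contained it uses an in-memory SQLite database and INSERT OR IGNORE as a stand-in for MariaDB and INSERT IGNORE. The reduced table layout and the function names are assumptions made for the example.

```python
import sqlite3


def open_demo_cache():
    """Create an in-memory table with a few of the columns shown above."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE publications ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " scopusid VARCHAR(255) UNIQUE,"
        " wpid INTEGER,"
        " title VARCHAR(255))"
    )
    return conn


def register_publication(conn, scopusid, title):
    """Insert a publication unless its Scopus id is already cached.

    Returns True if a new row was written, False if it was skipped.
    """
    cur = conn.execute(
        "INSERT OR IGNORE INTO publications (scopusid, title) VALUES (?, ?)",
        (scopusid, title),
    )
    conn.commit()
    return cur.rowcount == 1


if __name__ == "__main__":
    conn = open_demo_cache()
    print(register_publication(conn, "SCOPUS_ID:84859045029", "Poster"))
    print(register_publication(conn, "SCOPUS_ID:84859045029", "Poster"))
```

Only publications for which the insert reports a new row would need a new Wordpress post.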
Setup of scopus database in mysql
create database scopus;
CREATE USER 'scopus'@'localhost';
grant all on scopus.* to 'scopus'@'localhost' identified by '$scopus$';
# create tables
mysql -u scopus -p scopus < create_scopus.sql
Sometimes (unfortunately quite often) an author ID in Scopus is not unique but identifies several researchers with the same name, e.g. Michele Caselle (3 persons) and Matthias Balzer (2).
This case is currently handled manually by deleting all publications from the unknown authors. It might also be possible to implement a blacklist.
Sample data from Scopus:
{
"abstracts-retrieval-response": {
"authors": {
"author": [
{
"@_fa": "true",
"@auid": "15076530600",
"@seq": "1",
"affiliation": {
"@href": "http://api.elsevier.com/content/affiliation/affiliation_id/60102538",
"@id": "60102538"
},
"author-url": "http://api.elsevier.com/content/author/author_id/15076530600",
"ce:given-name": "Suren",
"ce:indexed-name": "Chilingaryan S.",
"ce:initials": "S.",
"ce:surname": "Chilingaryan",
"preferred-name": {
"ce:given-name": "Suren",
"ce:indexed-name": "Chilingaryan S.",
"ce:initials": "S.",
"ce:surname": "Chilingaryan"
}
},
{
"@_fa": "true",
"@auid": "35313939900",
"@seq": "2",
"affiliation": {
"@href": "http://api.elsevier.com/content/affiliation/affiliation_id/60102538",
"@id": "60102538"
},
"author-url": "http://api.elsevier.com/content/author/author_id/35313939900",
"ce:given-name": "Andreas",
"ce:indexed-name": "Kopmann A.",
"ce:initials": "A.",
"ce:surname": "Kopmann",
"preferred-name": {
"ce:given-name": "Andreas",
"ce:indexed-name": "Kopmann A.",
"ce:initials": "A.",
"ce:surname": "Kopmann"
}
},
{
"@_fa": "true",
"@auid": "56001075000",
"@seq": "3",
"affiliation": {
"@href": "http://api.elsevier.com/content/affiliation/affiliation_id/60032633",
"@id": "60032633"
},
"author-url": "http://api.elsevier.com/content/author/author_id/56001075000",
"ce:given-name": "Alessandro",
"ce:indexed-name": "Mirone A.",
"ce:initials": "A.",
"ce:surname": "Mirone",
"preferred-name": {
"ce:given-name": "Alessandro",
"ce:indexed-name": "Mirone A.",
"ce:initials": "A.",
"ce:surname": "Mirone"
}
},
{
"@_fa": "true",
"@auid": "35277157300",
"@seq": "4",
"affiliation": {
"@href": "http://api.elsevier.com/content/affiliation/affiliation_id/60102538",
"@id": "60102538"
},
"author-url": "http://api.elsevier.com/content/author/author_id/35277157300",
"ce:given-name": "Tomy",
"ce:indexed-name": "Dos Santos Rolo T.",
"ce:initials": "T.",
"ce:surname": "Dos Santos Rolo",
"preferred-name": {
"ce:given-name": "Tomy",
"ce:indexed-name": "Dos Santos Rolo T.",
"ce:initials": "T.",
"ce:surname": "Dos Santos Rolo"
}
},
{
"@_fa": "true",
"@auid": "35303862100",
"@seq": "5",
"affiliation": {
"@href": "http://api.elsevier.com/content/affiliation/affiliation_id/60102538",
"@id": "60102538"
},
"author-url": "http://api.elsevier.com/content/author/author_id/35303862100",
"ce:given-name": "Matthias",
"ce:indexed-name": "Vogelgesang M.",
"ce:initials": "M.",
"ce:surname": "Vogelgesang",
"preferred-name": {
"ce:given-name": "Matthias",
"ce:indexed-name": "Vogelgesang M.",
"ce:initials": "M.",
"ce:surname": "Vogelgesang"
}
}
]
},
"coredata": {
"citedby-count": "0",
"dc:description": "X-ray tomography has been proven to be a valuable tool for understanding internal, otherwise invisible, mechanisms in biology and other fields. Recent advances in digital detector technology enabled investigation of dynamic processes in 3D with a temporal resolution down to the milliseconds range. Unfortunately it requires computationally intensive recon- struction algorithms with long post-processing times. We have optimized the reconstruction software employed at the micro-tomography beamlines at KIT and ESRF. Using a 4 stage pipelined architecture and the computational power of modern graphic cards, we were able to reduce the processing time by a factor 75 with a single server. The time required to reconstruct a typical 3D image is reduced down to several seconds only and online visualization is possible for the first time.Copyright is held by the author/owner(s).",
"dc:identifier": "SCOPUS_ID:84859045029",
"dc:title": "Poster: A GPU-based architecture for real-time data assessment at synchrotron experiments",
"link": [
{
"@_fa": "true",
"@href": "http://api.elsevier.com/content/abstract/scopus_id/84859045029",
"@rel": "self"
}
],
"prism:aggregationType": "Conference Proceeding",
"prism:coverDate": "2011-12-01",
"prism:doi": "10.1145/2148600.2148624",
"prism:pageRange": "51-52",
"prism:publicationName": "SC'11 - Proceedings of the 2011 High Performance Computing Networking, Storage and Analysis Companion, Co-located with SC'11",
"prism:url": "http://api.elsevier.com/content/abstract/scopus_id/84859045029"
}
}
}
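A record of this shape can be reduced to its author IDs with a few dictionary lookups. The sketch below also shows one possible form of the blacklist idea mentioned above: a publication is rejected if it lists an author ID from a blacklist. This rule is an assumption made for illustration and is not implemented in the scripts. The function names are hypothetical.

```python
def author_ids(record):
    """Return the Scopus author ids of an abstract retrieval record."""
    authors = record["abstracts-retrieval-response"]["authors"]["author"]
    return [author["@auid"] for author in authors]


def is_blacklisted(record, blacklist):
    """True if any author of the record is on the blacklist."""
    return any(auid in blacklist for auid in author_ids(record))


if __name__ == "__main__":
    # Reduced copy of the sample record, two authors only.
    record = {
        "abstracts-retrieval-response": {
            "authors": {
                "author": [
                    {"@auid": "15076530600", "ce:indexed-name": "Chilingaryan S."},
                    {"@auid": "35313939900", "ce:indexed-name": "Kopmann A."},
                ]
            },
            "coredata": {"dc:identifier": "SCOPUS_ID:84859045029"},
        }
    }
    blacklist = {"7004126133"}  # example id from the "Others" block in config.py
    print(author_ids(record))
    print(is_blacklisted(record, blacklist))
```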
Installation of python, mysql et al:
pip install python-wordpress-xmlrpc
Web server configuration (apparently this has to be redone after every installation!!!)
/etc/apache2/httpd.conf:
LoadModule userdir_module libexec/apache2/mod_userdir.so
LoadModule php5_module libexec/apache2/libphp5.so
Include /private/etc/apache2/extra/httpd-userdir.conf
/etc/apache2/extra/httpd-userdir.conf:
Include /private/etc/apache2/users/*.conf
/etc/php.ini:
pdo_mysql.default_socket= /tmp/mysql.sock
mysql.default_socket = /tmp/mysql.sock
mysqli.default_socket = /tmp/mysql.sock
sh-3.2# apachectl restart
Install website:
Create archive with WP Duplicator
Save scopus database
mysqldump -u scopus -p scopus > scopus-170322.sql
Create database on remote system
mysql:
CREATE USER 'scopus'@'localhost' IDENTIFIED BY '$scopus$';
GRANT ALL PRIVILEGES ON scopus.* TO 'scopus'@'localhost';
CREATE DATABASE scopus;
mysql -u scopus -p scopus < scopus-170322.sql
Create database wp_ufo2;
CREATE USER 'ufo'@'localhost' IDENTIFIED BY '$ipepdv$';
GRANT ALL PRIVILEGES ON wp_ufo2.* TO 'ufo'@'localhost';
CREATE DATABASE wp_ufo2;
Import WP archive:
mkdir ufo2
chown -R wwwrun:www ufo2
Run the installer:
http://ufo.kit.edu/ufo2/installer.php
Installation Scopus-Scripts:
pip install requests
pip install python-wordpress-xmlrpc
pip install pymysql
Check configurations:
scopus-get-publications.py
ak_wordpress.py
Sometimes there are errors in the database. These cases require manual intervention.
Examples of errors that have been observed: