README scopus
Ak, 27.3.2017

Get information from the Scopus database.

These queries work only with access to Scopus (e.g. from the KIT LAN).
The Scopus service is not publicly available.

Version history

Version 1.0, 8.3.17 (ak):
- initial version of a single script without any options.
  It runs in 4 phases: get publications for individual author groups,
  create posts, get all citations, create comments.
- used with the test installation at the UFO server in March 2017

Content

info                          Documentation, website, etc.
readme.txt                    This file
my_scopus.py                  List of Scopus author ids
ak_scopus.py                  Functions to access Scopus
ak_wordpress.py               Functions to create Wordpress posts + comments
scopus-get-publications.py    Script to query Scopus
test-scopus.py                Application with some functions to get publication entries.
                              Prints a list with some formatting
test-scopus2.py               Example from one of the websites, only one query
test-wp.py                    Test script for access to the Wordpress API
test-wp2.py                   Test script for Wordpress - only query, no modification

Usage:

1. Go to Scopus and retrieve the Scopus author ids for the scientists in your group.
   Define the ids in my_scopus.py and group them.

2. Select one or more author groups in scopus-get-publications.py (main part at
   the end of the file). Check the definition of the database and the Wordpress
   installation.

3. Execute scopus-get-publications.py.

   python -W ignore scopus-get-publications.py

   Note: The -W ignore flag might be necessary if the INSERT IGNORE causes warnings.
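
The grouping from step 1 could look roughly like the sketch below. The name
AUTHOR_GROUPS, the helper author_ids() and the assignment of ids to groups are
assumptions made for illustration, not the actual content of my_scopus.py. The
ids themselves are taken from the sample data further down in this file.

```python
# Hypothetical layout of my_scopus.py: group name -> list of Scopus author ids.
# Which id belongs to which group is made up for this example.
AUTHOR_GROUPS = {
    "Computing": ["15076530600", "35313939900", "35303862100"],
    "X-ray Imaging": ["35277157300"],
}


def author_ids(selected_groups):
    """Return the author ids of the selected groups, without duplicates."""
    seen = []
    for group in selected_groups:
        for auid in AUTHOR_GROUPS[group]:
            if auid not in seen:
                seen.append(auid)
    return seen


if __name__ == "__main__":
    print(author_ids(["Computing", "X-ray Imaging"]))
```

Keeping the ids as strings avoids any accidental arithmetic or formatting of
the long numbers.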

Example run:

    ufo:~/scopus # python -W ignore scopus-get-publications.py

    ***********************************************
    **** scopus-get-publications / 2017-03-27 *****
    ***********************************************

    === Update of publications for the author group: Computing
    Total number of publications: 54
    === Update of publications for the author group: X-ray Imaging
    Total number of publications: 39
    === Update of publications for the author group: Electronics
    Total number of publications: 132
    === Update of publications for the author group: Morphology
    Total number of publications: 21

    === Create posts for newly registered publication in scopus
    Nothing new found

    === Update citatation of all publication in the database
    Total number of publications is 281

    === Create comments for newly registered citations in scopus
    Number of new citations is 0

    Summary: (see also logfile /root/scopus/scopus-publications.log)
    Date = 2017-03-27 21:28:36.002624
    NPubs = 281
    NNewPubs = 0
    NCites = 4699
    NNewCites = 0
    Runtime = 0:00:11.496362
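
The four phases visible in the run above can be sketched as a small driver.
This is not the code of scopus-get-publications.py: the function run() and the
four callables passed to it are hypothetical, and only the counter names are
taken from the summary output.

```python
import datetime


def run(groups, get_publications, create_posts, get_citations, create_comments):
    """Run the four phases in order and return the summary counters."""
    start = datetime.datetime.now()
    pubs = set()
    for group in groups:                 # phase 1: publications per author group
        pubs.update(get_publications(group))
    new_pubs = create_posts(pubs)        # phase 2: posts for new publications
    cites = get_citations(pubs)          # phase 3: citations of all publications
    new_cites = create_comments(cites)   # phase 4: comments for new citations
    return {
        "Date": start,
        "NPubs": len(pubs),
        "NNewPubs": len(new_pubs),
        "NCites": len(cites),
        "NNewCites": len(new_cites),
        "Runtime": datetime.datetime.now() - start,
    }


if __name__ == "__main__":
    # Stub phases so the sketch runs without Scopus, MySQL or Wordpress.
    summary = run(
        ["Computing"],
        get_publications=lambda group: {"SCOPUS_ID:84859045029"},
        create_posts=lambda pubs: [],
        get_citations=lambda pubs: [],
        create_comments=lambda cites: [],
    )
    for key, value in summary.items():
        print("%-10s = %s" % (key, value))
```

Passing the phases in as functions keeps the driver testable without network
or database access.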

Further enhancements

Todo:
- Reprocessing of all posts if the format has changed,
  e.g. when a button with an email to the author or a new category has been added
- Query only the latest citations for each publication, not all of them.
- Store the JSON data of all publications
- Get bibliographic information for display on the web page of a research group
  like UFO, or maybe later also for the DTS program.
- Handle wrong publications in Scopus for authors with the same name
- Automatically include reports and student theses via a
  BibTeX definition and upload on a server!?
  -> Would have the nice effect that all student work is organized systematically!!!

Structure of the database

Both tables keep the reference to the publications in Scopus and the
Wordpress ids. With this information, reprocessing is possible (but not
implemented yet).

Table publications:

MariaDB [scopus]> describe publications;
+--------------+--------------+------+-----+---------+----------------+
| Field        | Type         | Null | Key | Default | Extra          |
+--------------+--------------+------+-----+---------+----------------+
| id           | int(11)      | NO   | PRI | NULL    | auto_increment |
| scopusid     | varchar(255) | YES  | UNI | NULL    |                |
| wpid         | int(11)      | YES  |     | NULL    |                |
| citedbycount | int(11)      | YES  |     | NULL    |                |
| citesloaded  | int(11)      | YES  |     | NULL    |                |
| categories   | varchar(255) | YES  |     | NULL    |                |
| doi          | varchar(255) | YES  |     | NULL    |                |
| title        | varchar(255) | YES  |     | NULL    |                |
| abstract     | text         | YES  |     | NULL    |                |
| bibtex       | text         | YES  |     | NULL    |                |
| ts           | datetime     | YES  |     | NULL    |                |
| scopusdata   | text         | YES  |     | NULL    |                |
| eid          | varchar(255) | YES  |     | NULL    |                |
+--------------+--------------+------+-----+---------+----------------+

Table citations:

MariaDB [scopus]> describe citations;
+--------------+--------------+------+-----+---------+----------------+
| Field        | Type         | Null | Key | Default | Extra          |
+--------------+--------------+------+-----+---------+----------------+
| id           | int(11)      | NO   | PRI | NULL    | auto_increment |
| scopusid     | varchar(255) | YES  |     | NULL    |                |
| eid          | varchar(255) | YES  |     | NULL    |                |
| wpid         | int(11)      | YES  | MUL | NULL    |                |
| wpcommentid  | int(11)      | YES  |     | NULL    |                |
| citedbycount | int(11)      | YES  |     | NULL    |                |
| citesloaded  | int(11)      | YES  |     | NULL    |                |
| categories   | varchar(255) | YES  |     | NULL    |                |
| doi          | varchar(255) | YES  |     | NULL    |                |
| scopusdata   | text         | YES  |     | NULL    |                |
| title        | varchar(255) | YES  |     | NULL    |                |
| abstract     | text         | YES  |     | NULL    |                |
| bibtex       | text         | YES  |     | NULL    |                |
| ts           | datetime     | YES  |     | NULL    |                |
+--------------+--------------+------+-----+---------+----------------+
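
The unique key on scopusid is what makes a repeated INSERT IGNORE harmless.
The sketch below shows the idea on a reduced publications table. It uses an
in-memory SQLite database instead of MariaDB so that it runs anywhere; SQLite
spells the statement INSERT OR IGNORE. The helpers register() and
needs_citation_update(), the format of the stored ids, the second example id
and the reading of citesloaded as "number of citations already fetched" are
assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced version of the publications table, with SQLite column types.
conn.execute(
    """CREATE TABLE publications (
           id INTEGER PRIMARY KEY AUTOINCREMENT,
           scopusid TEXT UNIQUE,
           wpid INTEGER,
           citedbycount INTEGER,
           citesloaded INTEGER,
           title TEXT)"""
)


def register(scopusid, title, citedbycount):
    """Insert a publication once; a repeated scopusid is skipped silently."""
    conn.execute(
        "INSERT OR IGNORE INTO publications"
        " (scopusid, title, citedbycount, citesloaded) VALUES (?, ?, ?, 0)",
        (scopusid, title, citedbycount),
    )


def needs_citation_update():
    """Scopus ids whose reported citation count exceeds the loaded count."""
    rows = conn.execute(
        "SELECT scopusid FROM publications"
        " WHERE citedbycount > citesloaded ORDER BY id"
    )
    return [row[0] for row in rows]


register("SCOPUS_ID:84859045029", "first registration", 0)
register("SCOPUS_ID:84859045029", "duplicate, ignored", 5)
register("SCOPUS_ID:1", "made-up entry with citations", 3)
print(needs_citation_update())
```

With the three calls above the script prints ['SCOPUS_ID:1']: the duplicate is
dropped by the unique key, and only the made-up entry has more citations
reported than loaded.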

Setup of scopus database in mysql

    create database scopus;
    CREATE USER 'scopus'@'localhost';
    grant all on scopus.* to 'scopus'@'localhost' identified by '$scopus$';

    # create tables
    mysql -u scopus -p scopus < create_scopus.sql

Publications in Scopus:

Sometimes (unfortunately quite often) an author id in Scopus is not unique but
identifies several researchers with the same name, e.g. Michele Caselle (3 persons)
and Matthias Balzer (2).
This case is currently handled manually by deleting all publications from the unknown
authors. It might also be possible to implement a blacklist.
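
A blacklist as mentioned above is not implemented. A minimal sketch of how it
could work, with made-up ids and the hypothetical names BLACKLIST and
drop_blacklisted():

```python
# Hypothetical blacklist: Scopus author id -> ids of publications that belong
# to a different researcher with the same name. All ids below are made up.
BLACKLIST = {
    "1234567890": {"SCOPUS_ID:111", "SCOPUS_ID:222"},
}


def drop_blacklisted(auid, scopus_ids):
    """Return the publications of an author without the blacklisted ones."""
    blocked = BLACKLIST.get(auid, set())
    return [sid for sid in scopus_ids if sid not in blocked]


if __name__ == "__main__":
    found = ["SCOPUS_ID:111", "SCOPUS_ID:333"]
    print(drop_blacklisted("1234567890", found))
```

Applying the filter before the database insert would replace the manual
deletion, at the cost of maintaining the list by hand.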

Sample data from Scopus:

{
  "abstracts-retrieval-response": {
    "authors": {
      "author": [
        {
          "@_fa": "true",
          "@auid": "15076530600",
          "@seq": "1",
          "affiliation": {
            "@href": "http://api.elsevier.com/content/affiliation/affiliation_id/60102538",
            "@id": "60102538"
          },
          "author-url": "http://api.elsevier.com/content/author/author_id/15076530600",
          "ce:given-name": "Suren",
          "ce:indexed-name": "Chilingaryan S.",
          "ce:initials": "S.",
          "ce:surname": "Chilingaryan",
          "preferred-name": {
            "ce:given-name": "Suren",
            "ce:indexed-name": "Chilingaryan S.",
            "ce:initials": "S.",
            "ce:surname": "Chilingaryan"
          }
        },
        {
          "@_fa": "true",
          "@auid": "35313939900",
          "@seq": "2",
          "affiliation": {
            "@href": "http://api.elsevier.com/content/affiliation/affiliation_id/60102538",
            "@id": "60102538"
          },
          "author-url": "http://api.elsevier.com/content/author/author_id/35313939900",
          "ce:given-name": "Andreas",
          "ce:indexed-name": "Kopmann A.",
          "ce:initials": "A.",
          "ce:surname": "Kopmann",
          "preferred-name": {
            "ce:given-name": "Andreas",
            "ce:indexed-name": "Kopmann A.",
            "ce:initials": "A.",
            "ce:surname": "Kopmann"
          }
        },
        {
          "@_fa": "true",
          "@auid": "56001075000",
          "@seq": "3",
          "affiliation": {
            "@href": "http://api.elsevier.com/content/affiliation/affiliation_id/60032633",
            "@id": "60032633"
          },
          "author-url": "http://api.elsevier.com/content/author/author_id/56001075000",
          "ce:given-name": "Alessandro",
          "ce:indexed-name": "Mirone A.",
          "ce:initials": "A.",
          "ce:surname": "Mirone",
          "preferred-name": {
            "ce:given-name": "Alessandro",
            "ce:indexed-name": "Mirone A.",
            "ce:initials": "A.",
            "ce:surname": "Mirone"
          }
        },
        {
          "@_fa": "true",
          "@auid": "35277157300",
          "@seq": "4",
          "affiliation": {
            "@href": "http://api.elsevier.com/content/affiliation/affiliation_id/60102538",
            "@id": "60102538"
          },
          "author-url": "http://api.elsevier.com/content/author/author_id/35277157300",
          "ce:given-name": "Tomy",
          "ce:indexed-name": "Dos Santos Rolo T.",
          "ce:initials": "T.",
          "ce:surname": "Dos Santos Rolo",
          "preferred-name": {
            "ce:given-name": "Tomy",
            "ce:indexed-name": "Dos Santos Rolo T.",
            "ce:initials": "T.",
            "ce:surname": "Dos Santos Rolo"
          }
        },
        {
          "@_fa": "true",
          "@auid": "35303862100",
          "@seq": "5",
          "affiliation": {
            "@href": "http://api.elsevier.com/content/affiliation/affiliation_id/60102538",
            "@id": "60102538"
          },
          "author-url": "http://api.elsevier.com/content/author/author_id/35303862100",
          "ce:given-name": "Matthias",
          "ce:indexed-name": "Vogelgesang M.",
          "ce:initials": "M.",
          "ce:surname": "Vogelgesang",
          "preferred-name": {
            "ce:given-name": "Matthias",
            "ce:indexed-name": "Vogelgesang M.",
            "ce:initials": "M.",
            "ce:surname": "Vogelgesang"
          }
        }
      ]
    },
    "coredata": {
      "citedby-count": "0",
      "dc:description": "X-ray tomography has been proven to be a valuable tool for understanding internal, otherwise invisible, mechanisms in biology and other fields. Recent advances in digital detector technology enabled investigation of dynamic processes in 3D with a temporal resolution down to the milliseconds range. Unfortunately it requires computationally intensive recon- struction algorithms with long post-processing times. We have optimized the reconstruction software employed at the micro-tomography beamlines at KIT and ESRF. Using a 4 stage pipelined architecture and the computational power of modern graphic cards, we were able to reduce the processing time by a factor 75 with a single server. The time required to reconstruct a typical 3D image is reduced down to several seconds only and online visualization is possible for the first time.Copyright is held by the author/owner(s).",
      "dc:identifier": "SCOPUS_ID:84859045029",
      "dc:title": "Poster: A GPU-based architecture for real-time data assessment at synchrotron experiments",
      "link": [
        {
          "@_fa": "true",
          "@href": "http://api.elsevier.com/content/abstract/scopus_id/84859045029",
          "@rel": "self"
        }
      ],
      "prism:aggregationType": "Conference Proceeding",
      "prism:coverDate": "2011-12-01",
      "prism:doi": "10.1145/2148600.2148624",
      "prism:pageRange": "51-52",
      "prism:publicationName": "SC'11 - Proceedings of the 2011 High Performance Computing Networking, Storage and Analysis Companion, Co-located with SC'11",
      "prism:url": "http://api.elsevier.com/content/abstract/scopus_id/84859045029"
    }
  }
}
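
The fields needed for a post can be picked out of such a record with a few
lines of Python. The sketch below works on a trimmed copy of the sample above;
the function summarize() and the names of the returned keys are assumptions
and do not reflect ak_scopus.py.

```python
import json

# Trimmed copy of the sample above: two of the five authors, reduced coredata.
SAMPLE = """
{
  "abstracts-retrieval-response": {
    "authors": {"author": [
      {"@auid": "35313939900", "@seq": "2", "ce:indexed-name": "Kopmann A."},
      {"@auid": "15076530600", "@seq": "1", "ce:indexed-name": "Chilingaryan S."}
    ]},
    "coredata": {
      "citedby-count": "0",
      "dc:identifier": "SCOPUS_ID:84859045029",
      "dc:title": "Poster: A GPU-based architecture for real-time data assessment at synchrotron experiments",
      "prism:coverDate": "2011-12-01",
      "prism:doi": "10.1145/2148600.2148624"
    }
  }
}
"""


def summarize(record):
    """Pick the fields needed for a post from one abstract record."""
    response = record["abstracts-retrieval-response"]
    core = response["coredata"]
    # "@seq" is a string in the JSON, so sort by its integer value.
    authors = sorted(response["authors"]["author"], key=lambda a: int(a["@seq"]))
    return {
        "scopusid": core["dc:identifier"],
        "title": core["dc:title"],
        "doi": core.get("prism:doi"),                # tolerate a missing DOI
        "year": core["prism:coverDate"][:4],
        "citedbycount": int(core["citedby-count"]),  # counts arrive as strings
        "authors": [a["ce:indexed-name"] for a in authors],
    }


print(summarize(json.loads(SAMPLE)))
```

Note that all numbers in the Scopus response are delivered as strings and
have to be converted before they are compared or stored in integer columns.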

Installation of python, mysql et al:

    pip install python-wordpress-xmlrpc

Web server configuration (apparently has to be redone after every installation!!!)

/etc/apache2/httpd.conf:
    LoadModule userdir_module libexec/apache2/mod_userdir.so
    LoadModule php5_module libexec/apache2/libphp5.so
    Include /private/etc/apache2/extra/httpd-userdir.conf

/etc/apache2/extra/httpd-userdir.conf:
    Include /private/etc/apache2/users/*.conf

/etc/php.ini:
    pdo_mysql.default_socket= /tmp/mysql.sock
    mysql.default_socket = /tmp/mysql.sock
    mysqli.default_socket = /tmp/mysql.sock

sh-3.2# apachectl restart

Install website:

Create archive with WP Duplicator

Save scopus database
    mysqldump -u scopus -p scopus > scopus-170322.sql

Create database on remote system
    mysql:
    CREATE USER 'scopus'@'localhost' IDENTIFIED BY '$scopus$';
    GRANT ALL PRIVILEGES ON scopus.* TO 'scopus'@'localhost';
    CREATE DATABASE scopus;

    mysql -u scopus -p scopus < scopus-170322.sql

Create database wp_ufo2:
    CREATE USER 'ufo'@'localhost' IDENTIFIED BY '$ipepdv$';
    GRANT ALL PRIVILEGES ON wp_ufo2.* TO 'ufo'@'localhost';
    CREATE DATABASE wp_ufo2;

Import WP archive:
    mkdir ufo2
    chown -R wwwrun:www ufo2
    http://ufo.kit.edu/ufo2/installer.php

    Error: PHP module ZipArchive is missing
    Manual extraction is available in the advanced options !!!

Installation Scopus-Scripts:

    pip install requests
    pip install python-wordpress-xmlrpc
    pip install pymysql

Check configurations:
    scopus-get-publications.py
    ak_wordpress.py
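
Before the first run it can help to verify the settings of both scripts in one
go. The sketch below is an assumption about how such a check could look: the
setting names in REQUIRED and both functions are hypothetical and not part of
the scripts. It relies on pymysql.connect() and on Client and GetUserInfo from
python-wordpress-xmlrpc, which are imported only when the connections are
actually tested.

```python
# Hypothetical setting names; the real scripts may store them differently.
REQUIRED = (
    "db_host", "db_user", "db_password", "db_name",
    "wp_url", "wp_user", "wp_password",
)


def missing_settings(config):
    """Return the names of required settings that are absent or empty."""
    return [key for key in REQUIRED if not config.get(key)]


def check_connections(config):
    """Open both connections once; raises if one of them is not reachable."""
    # Imported here so that missing_settings() works without the packages.
    import pymysql
    from wordpress_xmlrpc import Client
    from wordpress_xmlrpc.methods.users import GetUserInfo

    db = pymysql.connect(
        host=config["db_host"],
        user=config["db_user"],
        password=config["db_password"],
        database=config["db_name"],
    )
    db.close()
    # wp_url is expected to point at the xmlrpc.php endpoint of the site.
    wp = Client(config["wp_url"], config["wp_user"], config["wp_password"])
    wp.call(GetUserInfo())


if __name__ == "__main__":
    settings = {"db_host": "localhost", "db_user": "scopus", "db_name": "scopus"}
    print("Missing settings:", missing_settings(settings))
```

check_connections() needs the running database and Wordpress installation, so
only missing_settings() is exercised in the example call.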