Tweaking Nikola CSS to Improve Table Display

Nikola's default table styling was poor. I fixed it by adding this file:

~nik/mysite/files/assets/css/custom.css

/* Tweaks by Aubrey Moore */

td {
    padding: 15px;
}

th {
    text-align: center;
    padding: 15px;
}

tr:nth-child(even) {
    background-color: #f2f2f2;
}

Using Scrapy to Find a String in a Web Site

Last updated Sunday, 12. February 2017 07:53AM

I wanted to find pages on the University of Guam College of Natural and Life Sciences website that contain a specific string. This short Python script, which uses the Scrapy framework, does the trick:

test_spider.py

from scrapy.spiders import CrawlSpider, Rule
from scrapy.linkextractors import LinkExtractor

class someSpider(CrawlSpider):
  name = 'crawltest'
  allowed_domains = ['cnas-re.uog.edu']
  start_urls = ['http://cnas-re.uog.edu']
  # Follow every link within the allowed domain and check each page.
  rules = (Rule(LinkExtractor(allow=()), follow=True, callback='parse_item'),)

  def parse_item(self, response):
    target = 'bell pepper'
    log = 'test_spider_log.md'
    # Compare bytes with bytes; str(response.body) gives a b'...' repr on Python 3.
    if target.encode('utf-8') in response.body:
      with open(log, 'a') as f:
        f.write('**{}** was found in <{}>\n'.format(target, response.url))
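
The check in parse_item is case-sensitive, so a page that only says "Bell Pepper" would be missed. Here is a minimal sketch of a case-insensitive check, using only the standard library. The helper name `page_contains` and the UTF-8 default are my assumptions, not part of the original script.

```python
def page_contains(body: bytes, target: str, encoding: str = "utf-8") -> bool:
    """Return True if target occurs in the page body, ignoring case.

    body is the raw response body (bytes); undecodable bytes are replaced
    rather than raising, since crawled pages are not always clean UTF-8.
    """
    text = body.decode(encoding, errors="replace")
    return target.casefold() in text.casefold()


# Hypothetical page fragments, for illustration only.
hit = page_contains(b"<p>Bell Pepper trials</p>", "bell pepper")
miss = page_contains(b"<p>Eggplant trials</p>", "bell pepper")
```

Inside the spider, this could be called as `page_contains(response.body, target)` in place of the `in` test.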

Executed from the command line using:

scrapy runspider test_spider.py -s DEPTH_LIMIT=2
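
To show what DEPTH_LIMIT=2 means, here is a rough sketch of a depth-limited walk over a toy link graph. The start page counts as depth 0, and links are only followed from pages shallower than the limit. The function `crawl_order` and the four-page site are hypothetical. Scrapy schedules requests in a different order, so this only shows which pages fall within the limit.

```python
from collections import deque


def crawl_order(links, start, depth_limit):
    """Breadth-first walk of a link graph, stopping at depth_limit.

    links maps each page to the pages it links to; start is at depth 0.
    Returns a list of (page, depth) pairs in the order they were reached.
    """
    seen = {start}
    order = [(start, 0)]
    queue = deque([(start, 0)])
    while queue:
        page, depth = queue.popleft()
        if depth == depth_limit:
            continue  # pages at the limit are visited, but their links are not followed
        for nxt in links.get(page, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append((nxt, depth + 1))
                queue.append((nxt, depth + 1))
    return order


# Hypothetical site: home -> a -> b -> c
toy_site = {"home": ["a"], "a": ["b"], "b": ["c"]}
visited = crawl_order(toy_site, "home", depth_limit=2)
```

With a limit of 2, page "c" (three links from the start page) is never requested.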

Output: test_spider_log.md

bell pepper was found in http://cnas-re.uog.edu/soils-of-guam/

bell pepper was found in http://cnas-re.uog.edu/cnas-publications/?auth=&limit=17&tgid=&type=&usr=&yr=

bell pepper was found in http://cnas-re.uog.edu/cnas-publications/?auth=&tgid=115&type=&usr=&yr=

bell pepper was found in http://cnas-re.uog.edu/cnas-publications/?auth=&tgid=66&type=&usr=&yr=
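
Several of the hits above are the same publications page reached with different query strings. Here is a small sketch that counts hits per URL path. It assumes each raw log line holds the URL in angle brackets, as the script's format string writes it. The function name `count_hits_by_path` and the sample lines are illustrative.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Only the <...> part of each log line matters here.
URL_IN_LOG = re.compile(r"<(https?://[^>\s]+)>")


def count_hits_by_path(log_lines):
    """Count logged hits per URL path, ignoring query strings."""
    counts = Counter()
    for line in log_lines:
        match = URL_IN_LOG.search(line)
        if match:
            counts[urlsplit(match.group(1)).path] += 1
    return counts


# Sample lines modeled on the output above (assumed raw format).
sample_log = [
    "**bell pepper** was found in <http://cnas-re.uog.edu/soils-of-guam/>\n",
    "**bell pepper** was found in <http://cnas-re.uog.edu/cnas-publications/?auth=&tgid=115&type=&usr=&yr=>\n",
    "**bell pepper** was found in <http://cnas-re.uog.edu/cnas-publications/?auth=&tgid=66&type=&usr=&yr=>\n",
]
hits = count_hits_by_path(sample_log)
```

For a real run, the lines would come from iterating over the open log file instead of `sample_log`.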


Publish Nikola Site on GitHub

Publishing my Nikola site on GitHub Pages was remarkably easy.

  • Initialized a git repository in the site directory and added the GitHub remote:
git init .
git remote add origin git@github.com:aubreymoore/blog.git
  • Pushed my site to GitHub from the command line:
nikola github_deploy
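
The github_deploy command reads its branch and remote settings from the site's conf.py. Below is a sketch of the relevant excerpt. The setting names are taken from Nikola's GitHub deployment documentation as I understand it, and the values are illustrative assumptions rather than this site's actual configuration.

```python
# conf.py (excerpt) -- values are assumptions for a project repository
GITHUB_SOURCE_BRANCH = 'src'        # branch holding the Nikola sources
GITHUB_DEPLOY_BRANCH = 'gh-pages'   # branch GitHub Pages serves from
GITHUB_REMOTE_NAME = 'origin'       # matches the remote added with git remote add
GITHUB_COMMIT_SOURCE = True         # also commit and push the source branch on deploy
```

The remote name has to match the one created by `git remote add` above. Check the branch names against the Nikola handbook for the installed version.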

Taxonomic Notes

Recording Synonyms