Docker Protip: Use .dockerignore

I learned the hard way. I was trying to build a Python container for some NLP jobs … i.e. scraping Twitter … and it kept taking 10 minutes … just … to … start … the build. WTF. It would sit there with a message saying “Sending build context to Docker daemon … ” and a counter telling me how many MB it was sending. It was getting close to 2 GB each time … suspicious.

So I went to Google and, sure enough, I found an article on Codefresh explaining that every time you run the docker build command it tarballs the entire build context (the working directory, in my case) and sends it to the Docker daemon. For me this meant tarballing the many GB of training data I had used for my tool. (hangs head and cries)

Add a .dockerignore to your working directory with a list of patterns to ignore and you’ll save yourself a ton of time.
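As a rough sketch, here is a small shell function that writes a starter .dockerignore. The function name write_dockerignore and every pattern in it (data, *.csv and so on) are assumptions about what a Python NLP project might have lying around, not the layout of the project described above.

```shell
# Write a starter .dockerignore into the given directory (default: current).
# All patterns below are hypothetical; adjust them to whatever is large or
# irrelevant in your own build context.
write_dockerignore() {
  dir="${1:-.}"
  cat > "$dir/.dockerignore" <<'EOF'
# version control and Python caches
.git
**/__pycache__
**/*.pyc
# large training data the image does not need (hypothetical paths)
data
*.csv
EOF
}
```

Run write_dockerignore in the project root, then rebuild and watch the “Sending build context” counter: it should drop to roughly the size of what is left.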

One Liner: Cleaning Up Old Git Branches

I am very lazy about deleting local git branches. I just checked and I have 100+ branches! WTF. That makes it a real pain to find the one I was working on yesterday. So first, let me get rid of the ones that I’ve already merged into master.

git branch --merged | egrep -v '(master|develop)' | xargs git branch -d
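One thing to watch: that egrep also skips any branch whose name merely contains master or develop, and it relies on the current branch being one of those two. A slightly stricter filter could look like the sketch below. The function name merged_candidates is made up for illustration, and it only filters text, so you can preview the list before deleting anything.

```shell
# Read `git branch --merged` style output on stdin and print the branches
# that look safe to delete, one per line.
merged_candidates() {
  # Drop the current branch (*) and branches checked out in other worktrees (+),
  # strip the leading indentation, then drop exact matches for protected names.
  grep -v '^[*+]' | sed 's/^[[:space:]]*//' | grep -vxE 'master|develop'
}

# Preview first; nothing is deleted here:
#   git branch --merged | merged_candidates
# When the list looks right:
#   git branch --merged | merged_candidates | xargs git branch -d
```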

But I still have 58 branches left! Let’s try sorting by committer date:

$ git for-each-ref --sort=committerdate refs/heads/ --format='%(refname:short) "%(committerdate:relative)"'

This command gives me a list of branches sorted by commit date. The output looks something like:

BL-1302567-dnp-global-dns "10 months ago"
BL-1311324-scheduled-scale-down "8 months ago"
hackweek-dnp-deploy "4 months ago"
BL-1305305-dnp-secretsmanager "3 months ago"
BL-1311060-dnp-deploy-v2.6.1 "3 months ago"
dnp-breakfix "3 months ago"
BL-1307167-sas-api-elb "2 months ago"
BL-1307165-patch "9 weeks ago"
BL-1307166-sas-dns "8 weeks ago"
BL-1307441-product-project "8 weeks ago"
sas-deploy-fix "8 weeks ago"
... 

Now I can use this to find branches where the top commit is older than a few months:

git for-each-ref --sort=committerdate refs/heads/ --format='%(refname:short) "%(committerdate:relative)"' | egrep -v "(master|develop)" | egrep "([4-9]|[1][0-2]) months ago" | xargs -n2 bash -c 'git branch -D $0'

Here I used egrep to remove the master and develop branches as a precaution, then again to select only branches that are 4 “months ago” or older. Then just pass the output to xargs and you’re all set.
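Matching on the “months ago” text works, but it depends on how git words relative dates. An alternative sketch compares timestamps instead. It assumes a Git recent enough to support the %(committerdate:unix) format, and the function name older_than and the 120-day threshold are arbitrary choices for illustration.

```shell
# Read "<unix-timestamp> <branch>" lines on stdin and print the branches whose
# tip commit is older than the cutoff timestamp given as $1.
# master and develop are skipped as a precaution.
older_than() {
  awk -v cutoff="$1" '$1 < cutoff && $2 != "master" && $2 != "develop" { print $2 }'
}

# Example: list branches untouched for roughly 120 days.
#   cutoff=$(( $(date +%s) - 120 * 24 * 60 * 60 ))
#   git for-each-ref --format='%(committerdate:unix) %(refname:short)' refs/heads/ \
#     | older_than "$cutoff"
# Pipe the result to `xargs git branch -D` once you trust the list.
```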

One-liners: Stopping all your docker containers

Let’s say you have multiple docker projects on your laptop and in the course of a day you bounce around between them. It’s easy to forget which ones have been shut down and which haven’t. If you’re using docker-compose this could mean several containers are running and eating your battery. Here’s a simple one-liner to shut them all down.

docker ps -q | xargs docker stop
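If nothing is running, some xargs implementations still invoke docker stop with no arguments, which just prints a usage error. A slightly more defensive version, as a sketch (the function name stop_all_containers is made up):

```shell
# Stop every running container, doing nothing when none are running.
stop_all_containers() {
  ids=$(docker ps -q)
  if [ -z "$ids" ]; then
    echo "no running containers"
    return 0
  fi
  # Word splitting is intended here: one argument per container ID.
  docker stop $ids
}
```

Drop it in your shell profile and call stop_all_containers at the end of the day.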

How to Bind to Eloquent Model Event in Laravel 5

Whenever a Laravel model is modified there are a number of events that fire that allow you to trigger your own action(s). For example, in my PoliticsEQ application I needed to calculate some statistics about keywords and sentiment scores and store them in a separate table to improve performance for some of the front-end graphs. The challenge was making sure the keyword statistics table was updated whenever a keyword was added or updated.

Laravel gives you the following events right out of the box: creating, created, updating, updated, saving, saved, deleting, deleted, restoring, restored. You can get the particulars in the Laravel Eloquent documentation. Most of them are pretty intuitive.

In my case it made sense to use the “saved” event. Making this happen was remarkably easy. All I had to do was create a new “Service Provider” and use the Model::saved pattern (here, Keyword::saved) in the provider’s boot method.

So first I created a KeywordStats provider.


<?php

namespace App\Providers;

use Illuminate\Support\ServiceProvider;
use App\Keyword, App\KeywordStat;
use Log;

class KeywordStats extends ServiceProvider
{
    /**
     * Bootstrap the application services.
     *
     * @return void
     */
    public function boot()
    {
        Keyword::saved(function($keyword) {
          $stat = KeywordStat::where('keyword_name', $keyword->name)->first();
          if (!$stat) {
            $stat = new KeywordStat(['keyword_name'=>$keyword->name]);
          }
          $stat->sentiment_avg = Keyword::where('name', $keyword->name)->avg('sentiment_score');
          // Count in the database instead of loading every row into memory first.
          $stat->total_usages = Keyword::where('name', $keyword->name)->count();
          $stat->save();
          Log::info("Updated {$keyword->name} / avg: {$stat->sentiment_avg} / count: {$stat->total_usages}");
        });
    }

    /**
     * Register the application services.
     *
     * @return void
     */
    public function register()
    {
        
    }
}

Then I simply registered the provider in the config/app.php providers array:


    App\Providers\KeywordStats::class,

Customize an Ubuntu Launcher Icon

Just a little tip for those Ubuntu users who may be looking to customize the launcher presentation of a particular application. If you right click on the application icon while it’s open and choose “Lock to Launcher”, a configuration file is created in the /home/user/.local/share/applications directory. Sometimes the default icon is borked because the application is non-standard, is in a non-standard location, or has been moved. You can easily customize the settings by editing the .desktop file. For instance:


#! /home/mike/.local/share/applications/code.desktop
[Desktop Entry]
Encoding=UTF-8
Version=1.0
Type=Application
Name=VSCode
Icon=/home/mike/vscode/resources/app/resources/linux/vscode.png
Path=/home/mike/vscode
Exec=/home/mike/vscode/Code
StartupNotify=false
StartupWMClass=Code
OnlyShowIn=Unity;
X-UnityGenerated=true

Hide WordPress Post from All Queries

Problem: you want to create a variation of a page but you don’t want it to show up on the home page or in any archives or anything. You just need a direct link so you can share it with someone.

Solution:

add_action('pre_get_posts', 'hide_hidden_posts');
function hide_hidden_posts($query) {
  if ( is_admin() ) {
    return $query;
  }

  if ( is_single() AND $query->is_main_query() ) {
    return $query;
  }
  $ids = wp_cache_get('hidden_posts', 'posts');
  if ( false === $ids ) { // cache miss; an empty array is a valid cached value
    global $wpdb;
    $ids = $wpdb->get_col("SELECT post_id FROM {$wpdb->prefix}postmeta WHERE meta_key = 'hide_post'");
    wp_cache_set('hidden_posts', $ids, 'posts');
  }
  $query->set('post__not_in', $ids);
  return $query;
}

This function will modify all WordPress’ frontend queries to exclude any posts with a custom field “hide_post”, except in the case that the query is the main query on a single post page.

Caveat: This will only be functional for plugins and themes using the WP_Query API. Custom queries will not be modified.

Block hacker attacks on WordPress’ xmlrpc.php

If you’re getting a ton of POST requests to your WordPress xmlrpc.php file, here’s a quick way to block all the bad IPs via iptables. In my case I’m using nginx and php-fpm, but something very similar would also work for Apache.

First, recognize the signature. Your access logs will look something like this:

5.135.68.51 - - [13/May/2015:12:14:59 -0400] "POST /xmlrpc.php HTTP/1.0" 499 0 "-" "Mozilla/5.0 (compatible; Googlebot/2.1;  https://www.google.com/bot.html)"
185.61.138.72 - - [13/May/2015:12:14:59 -0400] "POST /xmlrpc.php HTTP/1.0" 499 0 "-" "Mozilla/5.0 (compatible; Googlebot/2.1;  https://www.google.com/bot.html)"
185.11.147.17 - - [13/May/2015:12:14:59 -0400] "POST /xmlrpc.php HTTP/1.0" 404 168 "-" "Mozilla/5.0 (compatible; Googlebot/2.1;  https://www.google.com/bot.html)"
185.11.147.17 - - [13/May/2015:12:14:59 -0400] "POST /xmlrpc.php HTTP/1.0" 499 0 "-" "Mozilla/5.0 (compatible; Googlebot/2.1;  https://www.google.com/bot.html)"
185.62.188.76 - - [13/May/2015:12:15:01 -0400] "POST /xmlrpc.php HTTP/1.0" 499 0 "-" "Mozilla/5.0 (compatible; Googlebot/2.1;  https://www.google.com/bot.html)"
185.62.188.76 - - [13/May/2015:12:15:01 -0400] "POST /xmlrpc.php HTTP/1.0" 499 0 "-" "Mozilla/5.0 (compatible; Googlebot/2.1;  https://www.google.com/bot.html)"
185.62.188.76 - - [13/May/2015:12:15:01 -0400] "POST /xmlrpc.php HTTP/1.0" 499 0 "-" "Mozilla/5.0 (compatible; Googlebot/2.1;  https://www.google.com/bot.html)"
185.61.138.72 - - [13/May/2015:12:15:01 -0400] "POST /xmlrpc.php HTTP/1.0" 404 168 "-" "Mozilla/5.0 (compatible; Googlebot/2.1;  https://www.google.com/bot.html)"
5.135.68.51 - - [13/May/2015:12:15:02 -0400] "POST /xmlrpc.php HTTP/1.0" 499 0 "-" "Mozilla/5.0 (compatible; Googlebot/2.1;  https://www.google.com/bot.html)"
5.135.68.51 - - [13/May/2015:12:15:02 -0400] "POST /xmlrpc.php HTTP/1.0" 404 168 "-" "Mozilla/5.0 (compatible; Googlebot/2.1;  https://www.google.com/bot.html)"

Googlebot will NOT be POSTing to your xmlrpc.php like that. Next the trick is to figure out which IP addresses are harassing you. Run this in your terminal:

$> grep xmlrpc /var/log/nginx/access.log | cut -d' ' -f1 | sort | uniq -c | sort -rn | head
  29200 185.11.147.17
  17182 185.62.188.76
  10657 185.61.138.72
   8183 5.135.68.51
   1914 192.227.175.122
   1738 195.154.185.116
   1198 43.252.228.132
    501 205.234.152.218
    155 86.105.212.68
    103 141.138.157.95

Most likely all of these are hackers since it would be unlikely even Jetpack or some other WordPress service would hit your xmlrpc.php that frequently. But you can decide where the cut off should be by adding -n# to the head command above. In my case I chose head -n8 like so:

$> grep xmlrpc /var/log/nginx/access.log | cut -d' ' -f1 | sort | uniq -c | sort -rn | head -n8
  29200 185.11.147.17
  17182 185.62.188.76
  10657 185.61.138.72
   8183 5.135.68.51
   1914 192.227.175.122
   1738 195.154.185.116
   1198 43.252.228.132
    501 205.234.152.218

Sooo …. now you just need to wrap that in a loop that will create the iptables rules to block traffic from the IPs:

$> for ip in $(grep xmlrpc /var/log/nginx/access.log | cut -d' ' -f1 | sort | uniq -c | sort -rn | head -n8 | awk '{print $2}'); do iptables -A INPUT -s $ip -j DROP; done

No more hackers.
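If you would rather cut off by request count than by position in the list, and preview the rules before running anything as root, something like this sketch could work. The function name xmlrpc_offenders is made up, and the threshold of 500 is only chosen because it happens to reproduce the head -n8 cutoff in the listing above.

```shell
# Read access-log lines on stdin and print every client IP with more than
# $1 POST requests to xmlrpc.php.
xmlrpc_offenders() {
  grep -F 'POST /xmlrpc.php' | cut -d' ' -f1 | sort | uniq -c \
    | awk -v min="$1" '$1 > min { print $2 }'
}

# Dry run: print the iptables commands instead of executing them.
#   xmlrpc_offenders 500 < /var/log/nginx/access.log | while read -r ip; do
#     echo iptables -A INPUT -s "$ip" -j DROP
#   done
# Remove the `echo` (and run as root) once the list looks right.
```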