Sunday, April 23, 2017

Cadbury creme eggs ingredients

Cadbury's UK (as bought in Ireland):
Milk chocolate: Milk solids 14% minimum. Contains vegetable fats in addition to cocoa butter. Milk chocolate egg with a soft fondant centre (47%). Ingredients: Sugar, milk, glucose syrup, cocoa butter, invert sugar syrup, dried whey (from milk), cocoa mass, vegetable fats (palm, shea), emulsifier (E442), dried egg white, flavourings, colour (paprika extract).

Cadbury's US ingredients (as produced by Hershey's under license from Cadbury):
Milk Chocolate: (sugar, milk, chocolate, cocoa butter, milk fat, nonfat milk, soy lecithin, natural and artificial flavors), sugar, corn syrup, high fructose corn syrup, contains 2% or less of: artificial color (Yellow #6), artificial flavor, calcium chloride, egg whites.

These ingredients are from Easter 2017, so after Cadbury stopped using Dairy Milk chocolate for the eggs in the UK.

Tasting them side by side, I was a bit surprised to find a significant difference. The UK eggs were richer and creamier, with a better flavor. The US eggs just tasted too sugary by comparison, and the fondant was more translucent, like sugar syrup.

Friday, March 10, 2017

Preventing Windows OS from sleeping while your python code runs

Do you have a Python script that you want to run through to completion, but that might take several hours without user interaction?

Might you run it on a laptop or other Windows computer that has power management enabled, so that it might go to sleep or hibernate when not being used?

If you do nothing, Windows will likely sleep or hibernate before your script can complete.

The following simple piece of code can prevent this problem. When used, it will ask Windows not to sleep while the script runs. (In some cases, such as when the battery is running out, Windows will ignore your request.)

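class WindowsInhibitor:
    '''Prevent OS sleep/hibernate in Windows, using the Win32
    SetThreadExecutionState API.'''
    ES_CONTINUOUS = 0x80000000
    ES_SYSTEM_REQUIRED = 0x00000001

    def __init__(self):
        pass

    def inhibit(self):
        import ctypes
        print("Preventing Windows from going to sleep")
        # ES_CONTINUOUS keeps the request in force until it is changed
        ctypes.windll.kernel32.SetThreadExecutionState(
            WindowsInhibitor.ES_CONTINUOUS |
            WindowsInhibitor.ES_SYSTEM_REQUIRED)

    def uninhibit(self):
        import ctypes
        print("Allowing Windows to go to sleep")
        # clear ES_SYSTEM_REQUIRED, restoring normal power management
        ctypes.windll.kernel32.SetThreadExecutionState(
            WindowsInhibitor.ES_CONTINUOUS)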

To run it, simply:

import os

osSleep = None
# in Windows, prevent the OS from sleeping while we run
if os.name == 'nt':
    osSleep = WindowsInhibitor()
    osSleep.inhibit()

# do slow stuff

if osSleep:
    osSleep.uninhibit()

It is based on code from here, which also has code for preventing suspension on Linux under GNOME and KDE, should you need that.
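
If you would rather not have to remember the matching uninhibit call, the same request can be wrapped in a context manager. This is only a minimal sketch and not part of the code I based this on: the name keep_awake is made up for illustration, and it assumes the same SetThreadExecutionState call and flag values as the class above. On anything other than Windows it does nothing.

```python
import contextlib
import ctypes
import os

ES_CONTINUOUS = 0x80000000
ES_SYSTEM_REQUIRED = 0x00000001


@contextlib.contextmanager
def keep_awake():
    """Ask Windows not to sleep inside the with-block (hypothetical helper)."""
    on_windows = os.name == 'nt'
    if on_windows:
        ctypes.windll.kernel32.SetThreadExecutionState(
            ES_CONTINUOUS | ES_SYSTEM_REQUIRED)
    try:
        # yields True if the request was made, False on other platforms
        yield on_windows
    finally:
        # always restore normal power management, even if the work raised
        if on_windows:
            ctypes.windll.kernel32.SetThreadExecutionState(ES_CONTINUOUS)


with keep_awake():
    pass  # do slow stuff
```

The try/finally means the sleep request is released even if the slow work raises an exception.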

Thursday, March 2, 2017

ufsd NTFS driver on mount: Fixing 'Unknown Error 1000'

I've previously mentioned using the ufsd driver for NTFS or HFS+, because it is significantly faster at writing than the default ntfs-3g driver provided with Linux, and it supports writing to HFS+ drives even with journaling enabled. (It is available free for non-commercial use.)

But what happens if you have a hard power down with an NTFS drive mounted? Or you encounter corruption for whatever reason?

UFSD may give this error upon mounting it for read-write:

mount: Unknown error 1000

This error is because the "dirty" flag is set on the drive and ufsd won't mount it read/write for fear of corrupting it. In many simple cases, you can correct errors in Linux with ntfsfix, from the ntfsprogs package, and also use ntfsfix to clear the dirty flag.

So if the volume in question is /dev/sdb1, I could do the following as root while the drive is unmounted. The first command repairs simple issues. The second clears the dirty flag:

root ~ # ntfsfix /dev/sdb1   
root ~ # ntfsfix -d /dev/sdb1

IMPORTANT NOTE: If you have extremely valuable data, especially that was being written when the failures occurred, you should be very cautious, because these commands may cause you to lose data.
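
Since ntfsfix should only be run while the drive is unmounted, a small wrapper can refuse to proceed otherwise. The following is a rough sketch under stated assumptions, not something I have used: the function names are hypothetical, it only checks the device name exactly as it appears in /proc/mounts (a drive mounted via a symlink such as a by-uuid path would be missed), and it must be run as root.

```python
import subprocess


def is_mounted(device, mounts_text):
    """Return True if device appears as a source in /proc/mounts-style text."""
    for line in mounts_text.splitlines():
        fields = line.split()
        if fields and fields[0] == device:
            return True
    return False


def repair_commands(device):
    """The two ntfsfix invocations from the text, as argument lists."""
    return [["ntfsfix", device], ["ntfsfix", "-d", device]]


def repair(device):
    """Run ntfsfix on device, refusing if it is still mounted (needs root)."""
    with open("/proc/mounts") as f:
        if is_mounted(device, f.read()):
            raise RuntimeError(device + " is mounted; unmount it first")
    for cmd in repair_commands(device):
        # check=True stops at the first command that fails
        subprocess.run(cmd, check=True)
```

Calling repair("/dev/sdb1") would then run the same two commands as above, in the same order.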

A better approach may be to load Windows and run chkdsk /f (to fix file system errors) or chkdsk /r (to detect and mark bad sectors) on the offending drive. Chkdsk is much more sophisticated at detecting and fixing errors than anything available in Linux, although it too may cause you to lose data (and some data loss may be inevitable if power goes off while writing). Here is one approach (using a bootable CD) to using chkdsk even if you don't have Windows installed or handy.

A final note: if this drive is not critical for your machine and you mount it from fstab, you can add the errors=remount-ro flag to the fstab mount line, in order to avoid hanging up your boot when things go wrong.

Saturday, January 28, 2017

Reinvigorating an old laptop with fresh Windows 7 and a new SSHD

I've got an older laptop running Windows 7 and it was getting bogged down by cruft after more than 4 years.

The last time I tried, Windows Update took over a week to check for updates (pegging one of the processors that whole time). The lazy route would be to buy a new computer, and I looked into that. But I'm not happy with my options (long story), and otherwise, this laptop is great.

So I decided it was time for a hard drive upgrade for added speed and extra space, and re-install of the OS. As a benchmark, it took 3:30 to boot before I started (ouch!), with a 7200rpm hard drive.

After installing a fresh FireCuda 2TB hybrid SSHD drive at a very reasonable price (also available in 1TB capacity), it was time to build the leanest Windows I could. (In my machine, installing the drive was as easy as unscrewing one screw and swapping the drives; google for your laptop's "service manual" if you're unsure of the procedure.) If you're wondering why not install Windows 10, see below.

Start with the laptop's recovery media -- reinstall as it came, with Windows 7 pre-SP1. In my case, most drivers were not installed, and neither was Internet Explorer. I needed to manually copy over a Firefox installer in order to get going. And I attached an ethernet cable for internet, since ethernet worked out of the box (unlike wifi).

One of the main keys to keeping your install fast is to install as many updates as possible in as few steps as possible. The problem I had with Windows Update previously arose because it performs a brute-force comparison of available updates against installed updates, and this gets massively slower as the number of installed updates increases. Here is what I did to keep things slim (links are for 64-bit Windows 7):
  1. Delete unnecessary and obsolete programs that came with your computer image (using uninstall tool, or removing the installers for things that you won't install).
  2. Install Windows 7 Service Pack 1 (KB976932) (if your image was from pre-SP1).
  3. Install the latest .NET framework (4.6.2 as of this writing). 
  4. Install IE11.
  5. Update Windows Update in order to install the rollups below (KB3020369)
  6. (Optional; I did not do this.) At this point it may be wise to install the "enterprise hotfix rollup" (KB2775511) and associated fixes mentioned at the bottom of this article, in order to avoid even more updates later.
  7. Install the rollup including almost all fixes to Windows from SP1 up until May 2016 (KB3125574) - this is "almost SP2"
  8. Search for and install the latest "Security Monthly Quality Rollup" released either this month or last. I installed the January 2017 rollup (KB3212646). (You could install this rollup even if you're doing this later; Windows Update can take it from here.)
  9. Now install the latest drivers for your machine from the vendor's website. (If you have a Lenovo laptop like me, install only their System Update tool, which may also require you to install the .NET framework first, and then use it to update all of your drivers at once).
  10. Install Microsoft Security Essentials (or another antivirus software)
  11. Perform a few cycles of Windows Update and reboot; you'll still have maybe 40-100 security and optional updates.
  12. Delete installation files and do a Disk Cleanup (as administrator) to remove backups.
  13. Clone the machine from here to be able to recover more quickly next time, starting from this point. I used EASEUS Todo Backup; EASEUS Disk Copy is another potential option.
And you're set. Reinstall the software you use, and copy your data back on. 

For comparison's sake, I improved from booting in 3:30 to booting in 45 seconds, an almost 80% reduction (this is AFTER I re-installed all similar software). Nice!
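
For anyone who wants to check the arithmetic behind that figure, here is a quick sketch (the helper name to_seconds is just for illustration):

```python
def to_seconds(mmss):
    """Convert a 'minutes:seconds' string into a number of seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


before = to_seconds("3:30")   # 210 seconds
after = to_seconds("0:45")    # 45 seconds
reduction = (before - after) / before
print(f"{reduction:.1%}")     # prints 78.6%
```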

* So why not Windows 10? A few reasons. First, I'm completely happy with Windows 7: don't fix what ain't broke. Next, I don't like that with Windows 10 I'm at the mercy of upgrades from Microsoft that I might not want but cannot decline, and which may break things. I also don't want many of the new features. And finally, I don't like the serious lack of privacy in Windows 10: Microsoft sends lots of data from your computer all the time. It's possible to lock it down somewhat, but it's tricky and always shifting (see above updates). Thanks but no thanks. Plus I missed the free upgrade window and definitely don't want to pay.

** If you buy the same great hard drive that I did from the link above, I'll get a small commission at no charge to you. Win-win!

Tuesday, January 10, 2017

Adjusted ADA scores from 1947-2015

Below you will find updated Americans for Democratic Action (ADA) scores covering the period 1947 to 2015 (the latest available). They are based upon ADA scores of selected congressional vote records independently tabulated by Groseclose, Levitt, and Snyder (1999) (for 1947-1998 originally and later extended by Groseclose to 2008) and Anderson and Habel (2009) (for 1947-2007), and have been updated and reconciled by myself (2008-2015). They are adjusted using the improved procedure from Dr. Groseclose, still based upon his original paper.

The final (adjusted) data are based upon the Anderson collection (mainly because when I started work that was what I had accessible). I correct over 150 mislabeled records that do not match valid ICPSR records as maintained by Keith Poole (as augmented with preliminary records for the 114th congress). Errors were often due to changes in seats, especially to people with similar names, or changes in party of the congressmen.

For the period 1990-2008, I additionally hand-corrected all discrepancies between the Anderson and Groseclose data. In some cases, records were missing from one source or the other. In others the scores were incorrectly transcribed or identifying data were incorrect. In a few cases the source data were ambiguous: in some years a score is provided even if the member served for less than half the eligible votes, and occasionally the score does not match the recorded votes. Where possible, I trust the recorded votes over the scores, and omit any congressmen ineligible or deceased for more than half the votes. I also generally counted absences the same as negative votes, even if the congressman served a partial term (since this is ADA's general practice), although for 2007-2015 I omitted anyone who missed more than 6 votes (as I learned this was the practice of Groseclose et al.). There were around 380 such discrepancies and 150+ corrections. Given this work, the period since 1990 can be trusted to be of high accuracy. Earlier data still have discrepancies, and data prior to 1972 show a large number of discrepancies, which I highlight in the Excel files provided below. These seem to largely be due to different policies for how to treat votes in years before a numerical score was assigned by ADA. I leave further correction / homogenization for future work.

I calculate the scores using two base years for the adjustment: 1980 (as used in both the above papers), and 1999 (the base year for the Political Quotient [PQ] in Dr. Groseclose's book Left Turn, which he chose because empirically that base year gives the average congressman in the 2000s an adjusted score near 50). If you would prefer a different base year, the code to produce it is below as well.
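
To give a rough idea of what changing the base year means, here is a minimal sketch. It assumes the linear "shift and stretch" form of the original paper, in which a raw score in a given year equals shift + stretch * x for an underlying position x. The function names and the parameter values are made up for illustration; they are not the estimated values, which come from the code in the download.

```python
def to_common_scale(raw, shift, stretch):
    """Undo a year's shift and stretch, assuming raw = shift + stretch * x."""
    return (raw - shift) / stretch


def to_base_year(raw, year_params, base_params):
    """Express a raw score from one year in the units of the base year."""
    x = to_common_scale(raw, *year_params)
    base_shift, base_stretch = base_params
    return base_shift + base_stretch * x


# hypothetical (shift, stretch) pairs, NOT estimated values
params_1980 = (0.0, 1.0)
params_1999 = (5.0, 1.25)

# a raw 55 in 1999, expressed on the 1980 scale: (55 - 5) / 1.25 = 40.0
score = to_base_year(55.0, params_1999, params_1980)
```

Under this assumption, switching base years is just an affine rescaling, so the ordering of members within a year does not change.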


Adjusted scores with 1980 base year (.xlsx file)
Adjusted scores with 1999 base year (.xlsx file)
Raw data and code to reproduce (including raw output files and parameters) (.zip file)

Citations: The original papers above.

Wednesday, November 30, 2016

Simplified Fonts for ggplot2 in R

I struggled for a while to get fonts to work properly with ggplot2 charts in R under Windows. The solution turned out to be easier than it seemed. The "old" way was to use library("extrafont"), which would then scan your entire fonts directory each run (slowly). Then if you got that working and you wanted to export a chart to a PDF, say, you'd need to install Ghostscript and embed the fonts after generating it. Nowadays R can do it all internally and, with a bit of setup, not have to scan the fonts at all. That's thanks to the showtext and Cairo libraries.

You just have to find the filename of the font you want from your fonts directory. (In Windows, open Control Panel -> Fonts, then view details; you may need to add a "Font file name" column. If no name appears, it may be a grouping; open the grouping and do the same.) In my case I wanted to use the Perpetua font, which has the filename PER_____.ttf.

install.packages("showtext") # once
install.packages("Cairo")    # once
library("ggplot2")  # for qplot, theme and ggsave
library("showtext") # for font_add
library("Cairo") # for embedding fonts in PDF; may not need to be loaded here

# add the desired font to the font database (you can add multiple)
# (font_add was called font.add in older versions of showtext)
font_add("perpetua", "PER_____.ttf")

# the following should only be necessary in windows, and often isn't documented
# for each font you add, do this, mapping the Windows name and type to a font family
# variable (Perpetua in this case) that you will refer to it as.
windowsFonts(Perpetua=windowsFont("TT Perpetua"))

# plot something
# and use perpetua font for text (by default - any text can be customized)
qplot(1:10) +
  theme(text = element_text(family="Perpetua"))

# save to file; using Cairo drivers to embed the fonts as needed
ggsave("mychart.eps", width=6.5, height=5.5, device=cairo_ps)
ggsave("mychart.pdf", width=6.5, height=5.5, device=cairo_pdf)

Note that you shouldn't need any special driver to save as an image file (jpg/png/...). I have encountered a few fonts that don't seem to embed correctly and I'm not sure why that is at the moment, but most fonts seem to work fine with this method; they are viewable on screen and in PDFs. This procedure should theoretically work cross-platform (except that the windowsFonts call will not be needed), which is another advantage to this method, although I have not yet tested this.

You can also use Google fonts like so, so you don't even have to find one on your system:

font_add_google("Roboto", "roboto")

There is more about the showtext library here. Hope this helps you. Be sure to leave comments if you find any improvements to this method.

Thursday, September 22, 2016

Notes on using git and github

Everybody knows about github and what an amazing resource it is. This post is about using git with github.

Fork what you're interested in on github. In the following, user is your username and repository is the repository you forked.
git config --global core.editor <your_favorite_editor>

You may want to modify the EOL settings to match your preferences and platform.

git clone https://github.com/user/repository.git
cd repository
git config user.name "user"
git config user.email "<your_email>"
git remote add upstream <upstream_repository_url>

To integrate new changes from the upstream repo into yours:

git fetch upstream
git rebase -i upstream/master

If there are redundant commits, 'squash' them in the first "interactive" message. The -i is important; otherwise you'll get stuck with a bunch of redundant commits.

If there are any conflicts (changes to the same general area of code, even if just adjacent, or even if equivalent), you will be dropped to the command prompt commit by commit. Edit the conflicting file as it should be, removing all the conflict marker lines (<<<<<<<, ======= and >>>>>>>). When satisfied, run git add thefile and then git rebase --continue. The resolved changes will be applied in a new commit.
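
It is easy to miss a marker in a long file, so before running git add it can help to check for leftovers. Here is a small sketch of such a check; the function name is hypothetical, and it assumes git's default markers of exactly seven repeated characters at the start of a line.

```python
import re

# git writes markers as seven repeated characters at the start of a line,
# followed either by a space (and a label) or by the end of the line
MARKER = re.compile(r"^(<{7}|={7}|>{7})( |$)")


def leftover_markers(text):
    """Return the 1-based line numbers that still hold a conflict marker."""
    return [number for number, line in enumerate(text.splitlines(), 1)
            if MARKER.match(line)]


sample = "a\n<<<<<<< HEAD\nours\n=======\ntheirs\n>>>>>>> abc123\nb\n"
print(leftover_markers(sample))   # prints [2, 4, 6]
```

A longer run of equals signs, such as a heading underline, is deliberately not treated as a marker.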

Similarly if you want to create a new branch for your own use:

git checkout -b branch

To switch branches at any time (without uncommitted changes), just omit the -b. If you want to instead create the branch at a previous commit, add that commit's hash at the end.

Be sure to push any changes to the master branch to github before changing to other local branches. Then pull from github before rebasing.

Then you can rebase as above with the other branch. But if the other branch is pushed to github, it is dangerous to push if anyone else is using it. If you're sure they're not, you can git push --force while on that branch (if git config --global push.default simple). If there were any other users there, after pulling, they'd have to blow away their unpushed local commits with git reset --hard origin/branch.

Here is more on rebasing and merging:

Or you can move just one commit to the current branch by using:
git cherry-pick <hashcode>

If you need to clean things up:

If you need to "undo" a change made on github, first pull to update your local repo. Then:
git reset --soft HEAD^, and 
git push origin +branchName (see caveats). 

About reverting, resetting etc, see:

To update an existing remote branch from a local branch (which is currently checked out):
git push origin local_branch_name:remote_branch_name
or if the branch names match, do this and it will work in the future too:
git push --set-upstream origin local_branch_name

All in all, my experience is that git is vastly inferior to mercurial (hg); git is far more finicky, harder to use and more prone to ugliness, and mercurial has nice GUIs from TortoiseHg. Git feels like an advanced patch manager that has morphed into a version control system, while mercurial feels like an advanced version control system. But alas, the linux kernel uses git and thus we have github and the rest is history. But I still use mercurial whenever I have a choice.