Monitoring alerted me to a couple of servers which had lost the ability to replicate SYSVOL using FRS. Microsoft KB290762 (https://support.microsoft.com/en-us/help/290762/using-the-burflags-registry-key-to-reinitialize-file-replication-service) provided instructions on how to recover.

In summary, on all members of the replication set (in this instance all DCs) stop the ntfrs service using:

net stop ntfrs

Then choose one server which will hold the authoritative copy and set BurFlags to 0xd4:

reg add "HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\NtFrs\Parameters\Backup/Restore\Process at Startup" /v BurFlags /t REG_DWORD /d 0xd4 /f

On the other servers, set BurFlags to 0xd2:

reg add "HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\NtFrs\Parameters\Backup/Restore\Process at Startup" /v BurFlags /t REG_DWORD /d 0xd2 /f

On the authoritative server, start the ntfrs service and watch for event 13516 in the “File Replication Service” event log. Once that event is logged, start the ntfrs service on the next server and again wait for the 13516 event log. Repeat this on the remaining servers.
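The ordering of the steps above matters, so here is a rough sketch which only prints the commands in the order they should be run. The server names (DC1, DC2, DC3) are hypothetical, and "net start ntfrs" is assumed as the counterpart of the "net stop ntfrs" command shown earlier. Nothing is executed, and the wait for event 13516 between service starts is still a manual check.

```python
# Sketch only: builds the ordered list of commands for the D4/D2 recovery.
# Server names are hypothetical placeholders; nothing is executed here.
KEY = (r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\NtFrs"
       r"\Parameters\Backup/Restore\Process at Startup")


def burflags_command(value):
    """Return the reg.exe command that sets BurFlags to the given value."""
    return f'reg add "{KEY}" /v BurFlags /t REG_DWORD /d {value} /f'


def recovery_plan(authoritative, others):
    """Return an ordered list of (server, command) steps."""
    servers = [authoritative] + list(others)
    # 1. Stop FRS on every member of the replica set.
    plan = [(server, "net stop ntfrs") for server in servers]
    # 2. D4 on the authoritative copy, D2 on everything else.
    plan.append((authoritative, burflags_command("0xd4")))
    plan += [(server, burflags_command("0xd2")) for server in others]
    # 3. Start the authoritative server first, then the others one at a
    #    time, waiting for event 13516 on each before moving on.
    plan += [(server, "net start ntfrs") for server in servers]
    return plan


if __name__ == "__main__":
    for server, command in recovery_plan("DC1", ["DC2", "DC3"]):
        print(f"{server}: {command}")
```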

EDIT: I have since discovered FRS was still being used due to upgrades from Windows 2003. I followed https://blogs.technet.microsoft.com/filecab/2008/02/08/sysvol-migration-series-part-1-introduction-to-the-sysvol-migration-process/ to migrate from FRS to DFSR. In future, if replication breaks it can be reinitialised by following this doc: https://support.microsoft.com/en-ie/help/2218556/how-to-force-an-authoritative-and-non-authoritative-synchronization-for-dfsr-replicated-sysvol-like-d4-d2-for-frs

 

Short bullet points as I may get round to expanding this:

ASRock C2750D4I with 1.35V DRAM modules.

VMware client showed health status as an alert due to low RAM voltage (expected as they are 1.35V but it seems the machine’s BIOS sets the threshold for 1.5V RAM modules)

ESXi 6.0 health indicated VCCM1 (voltage controller for memory 1??) was in a warning state at 1.35V.

 

I found a statically linked copy of ipmitool at https://fattylewis.com/ipmitool-on-esxi-6/ which I uploaded to the ESXi host in question. This worked first time!

./ipmitool chassis status
System Power : on
Power Overload : false
Power Interlock : inactive
Main Power Fault : false
Power Control Fault : false
Power Restore Policy : previous
Last Power Event : ac-failed
Chassis Intrusion : inactive
Front-Panel Lockout : inactive
Drive Fault : false
Cooling/Fan Fault : false

I checked the sensors:

./ipmitool sensor list
....<snipped>...
ATX+5VSB | 5.040 | Volts | ok | 4.050 | 4.260 | 4.500 | 5.490 | 5.760 | 6.030
+3VSB | 3.440 | Volts | ok | 2.660 | 2.800 | 2.960 | 3.620 | 3.800 | 3.980
Vcore1 | 1.050 | Volts | ok | 0.540 | 0.570 | 0.600 | 1.490 | 1.560 | 1.640
Vcore2 | na | Volts | na | 0.540 | 0.570 | 0.600 | 1.490 | 1.560 | 1.640
VCCM1 | 1.350 | Volts | nc | 1.210 | 1.280 | 1.380 | 1.650 | 1.730 | 1.810
VCCM2 | na | Volts | na | 1.210 | 1.280 | 1.380 | 1.650 | 1.730 | 1.810
....<snipped>...

There we have it, the warning is set for 1.38V. This can be changed with IPMI tool:

./ipmitool sensor thresh VCCM1 lnc 1.32

Now we check the result:

./ipmitool sensor list VCCM1|grep VCCM
VCCM1 | 1.340 | Volts | ok | 1.210 | 1.280 | 1.320 | 1.650 | 1.730 | 1.810
VCCM2 | na | Volts | na | 1.210 | 1.280 | 1.380 | 1.650 | 1.730 | 1.810
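In the sensor listing, the six columns after the status field are the thresholds: lower non-recoverable, lower critical, lower non-critical, then upper non-critical, upper critical and upper non-recoverable. As a minimal sketch (the function names are my own, and it only parses text that has already been captured from ipmitool), the following shows why the 1.35V reading tripped the lower non-critical threshold before the change and not after:

```python
# Threshold columns in the order ipmitool prints them after the status field.
FIELDS = ("lnr", "lcr", "lnc", "unc", "ucr", "unr")


def _number(text):
    """Convert a sensor field to float, treating 'na' as missing."""
    return None if text == "na" else float(text)


def parse_sensor_line(line):
    """Parse one pipe-delimited line of 'ipmitool sensor' output."""
    parts = [part.strip() for part in line.split("|")]
    name, reading, unit, status = parts[:4]
    thresholds = {key: _number(value) for key, value in zip(FIELDS, parts[4:10])}
    return {
        "name": name,
        "reading": _number(reading),
        "unit": unit,
        "status": status,
        "thresholds": thresholds,
    }


def below_lower_noncritical(sensor):
    """True if the reading is under the lower non-critical threshold."""
    reading = sensor["reading"]
    lnc = sensor["thresholds"]["lnc"]
    return reading is not None and lnc is not None and reading < lnc


if __name__ == "__main__":
    before = "VCCM1 | 1.350 | Volts | nc | 1.210 | 1.280 | 1.380 | 1.650 | 1.730 | 1.810"
    after = "VCCM1 | 1.340 | Volts | ok | 1.210 | 1.280 | 1.320 | 1.650 | 1.730 | 1.810"
    for line in (before, after):
        sensor = parse_sensor_line(line)
        print(sensor["name"], sensor["reading"], below_lower_noncritical(sensor))
```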

And the VMware health status got fixed too!

 

 

For now, I will need to run this command after every reboot or reset of the IPMI management controller. I guess I should automate this for system boot-up….

FOLLOW-UP: I have just upgraded the BMC firmware from v0.27.00 to v0.30.00 and the issue seems to be resolved as the VCCM thresholds have been redefined:

./ipmitool sensor|grep VCCM
VCCM | 1.350 | Volts | ok | 1.090 | 1.120 | na | na | 1.720 | 1.750

Oh well, this was an interesting learning experience. Maybe this ipmitool info will help someone else. Note that this is a useful tool as it allows one to configure the management controller from the host while ESXi is running. Useful perhaps to configure the LAN, change the admin user or reset the controller entirely. For example, we can configure the IP address as follows:

./ipmitool lan set 1 ipsrc static
./ipmitool lan set 1 ipaddr 172.16.1.5
./ipmitool lan set 1 netmask 255.255.255.0
./ipmitool lan set 1 defgw ipaddr 172.16.1.1
./ipmitool lan set 1 arp respond on
./ipmitool lan set 1 arp generate on
./ipmitool lan set 1 arp interval 60

Extra info about using ipmitool and controlling the fans on this motherboard at https://blog.chaospixel.com/linux/2016/09/Fan-control-on-Asrock-C2750D4I-C2550D4I-board.html

The Samba module vfs_shadow_copy2 is useful for shares hosted on snapshot capable filesystems/storage. This module allows previous “snapshot” versions of a share to be made visible to users.  This allows for self-service restore of files by end users. On a given share, in smb.conf, you configure something like

[data]
 vfs objects = acl_xattr btrfs shadow_copy2
 path = /btrfs/samba/data
 shadow:basedir = /btrfs/samba/data
 shadow:snapdir = ../data_SNAPS

Depending on your version of Samba, to use the above which includes the “../” link, you might need to add some/all of the below options:

unix extensions = no
wide links = yes
allow insecure wide links = yes

In the 4.3.x and 4.4.x releases there have been a few changes to the vfs_shadow_copy2 module meaning the above three options may or may not have been needed. The jump from 4.4.9 to 4.4.10 addressed Samba BUG 12531. This again involved changes to the vfs_shadow_copy2 module, again breaking the above config (with or without the three options listed). I’ve changed the above config on the share to be

[data]
 vfs objects = acl_xattr btrfs shadow_copy2
 path = /btrfs/samba/data
 shadow:basedir = /btrfs/samba/data
 shadow:snapdir = /btrfs/samba/data_SNAPS

which works without the three options above. Samba 4.4.13, 4.5.8, 4.6.2 also work with the above. It’s frustrating since relative (which I assume includes “../”) directories are supposed to be supported with “shadow:snapdir”. Hopefully this configuration now works for all releases going forward.

I find it frustrating that minor releases can break working configs. It makes it difficult to quickly deploy security fixes as the required level of testing is much higher than one would expect for minor releases. This subtle change to the config option is not documented or mentioned in the release notes.  Anyway, more of a rant post than usual… apologies for that!

Upgrading a VM’s VMware hardware version to the latest version is generally considered best practice. This is easy to do via the C# client or the web-client. If you wish to upgrade to a newer, but not latest, hardware version this can be tricky.

As highlighted in https://www.v-front.de/2014/02/how-to-uprade-your-vms-virtual-hardware.html, one easy way to do so is as follows on the ESXi host in question:

vim-cmd vmsvc/upgrade vmid vmx-10

where vmid is determined using something like:

vim-cmd vmsvc/getallvms|awk '{print $1 "  "$2" "$6}'

The appropriate hardware version can be selected from a useful VMware Virtual Machine Hardware Versions KB. The recent relevant versions are:

vmx-13: ESXi 6.5

vmx-12: Workstation Pro 12.x

vmx-11: ESXi 6.0 / Workstation 11.x

vmx-10: ESXi 5.5 / Workstation 10.x

vmx-9: ESXi 5.1 / Workstation 9.x

vmx-8: ESXi 5.0 / Workstation 8.x

vmx-7: ESXi/ESX 4.x
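A VM has to use a hardware version that every host it may run on supports, so in a mixed environment the target is set by the oldest ESXi release. A small sketch using only the ESXi entries from the list above (the function name and the release strings used as keys are my own choice):

```python
# ESXi release -> highest virtual hardware version, taken from the list above.
# vmx-12 is omitted as it is listed for Workstation Pro only.
HW_VERSION_BY_ESXI = {
    "4.x": 7,
    "5.0": 8,
    "5.1": 9,
    "5.5": 10,
    "6.0": 11,
    "6.5": 13,
}


def highest_common_hw_version(esxi_releases):
    """Return the newest vmx-N label supported by all of the given releases."""
    return "vmx-%d" % min(HW_VERSION_BY_ESXI[release] for release in esxi_releases)


if __name__ == "__main__":
    # Hosts on 6.0 and 5.5: the VM should go no higher than vmx-10.
    print(highest_common_hw_version(["6.0", "5.5"]))
```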

 

 

 

Introduction

While working in a lab environment recently I wanted to vMotion a VM between two ESXi hosts. The vMotion failed, which was not entirely unexpected, due to CPU incompatibilities. These particular ESXi hosts are not in a vSphere cluster so enabling EVC (Enhanced vMotion Compatibility), which would resolve the issue, is not an option.

Attempting to vMotion a VM from host A to host B gave errors about:

  • MOVBE
  • 3DNow! PREFETCH and PREFETCHW

After powering the VM down, migrating the VM to host B and powering on, an attempt to vMotion from host B to host A gave errors about:

  • PCID
  • XSAVE
  • Advanced Vector Extensions (AVX)
  • Half-precision conversion instructions (F16C)
  • Instructions to read and write FS and GS base registers
  • XSAVE SSE State
  • XSAVE YMM State

Manual VM CPUID mask configuration

As the error messages indicate, Enhanced vMotion Compatibility (EVC) would enable the vMotion to take place between mixed CPUs within a cluster. Mixing host CPU models within a vSphere cluster is generally considered bad practice and should be avoided where possible. In this instance, the hosts are not in a cluster so EVC is not even an option.

As indicated above shutting down the VM and doing a cold migration is possible. This issue only relates to the case where I want to be able to migrate running VMs between hosts containing processors with different feature sets.

For the two hosts in question, I know (based on the EVC processor support KB, Intel ARK and VMware KB pages) that the Intel “Westmere” Generation baseline ought to be the highest compatible EVC mode; one of the processors is an Intel Avoton C2750 and the other is an Intel i7-3770S, which is an Ivy Bridge part. The Avoton falls into the Westmere category for EVC. We will come back to EVC later on.

I suspected it would be possible to create a custom CPU mask to enable the vMotion between these two hosts. In general, features supported by a given processor are exposed via a CPUID instruction call. By default, VMware ESXi manipulates the results of the CPUID instructions executed by a VM as part of the virtualisation process. EVC is used to further manipulate these feature bits to hide CPU features from VMs to ensure that “well behaved VMs” are able to run when migrated between hosts containing processors with different features.

In this instance, “well behaved VMs” refers to VMs running code which use the CPUID instruction to determine available features. If the guest OS or application uses the CPUID instruction to determine available processor features then, when moved via vMotion to a different host, that same set of features will be available. If a guest uses some other mechanism to determine processor feature availability (e.g. based on the processor model name) or merely assumes a given feature will be available then the VM or application may crash or have other unexpected errors.

So back to this experiment. Attempting to go from host A to host B indicated only two feature incompatibilities. I turned to the Intel developers manual (64-ia-32-architectures-software-developer-vol-2a-manual.pdf) for the detail about the CPUID instruction. CPUID called with EAX=0x80000001 results in the PREFETCHW capability being exposed at bit 8 in ECX. Similarly with EAX=0x1, the MOVBE capability is exposed at bit 22 in ECX.

As an initial test, I did a cold migration of the VM to host A and edited the VM settings as shown below.

In summary, this is passing the CPUID result of the host via the above mask filter. The mask filter can hide, set or pass through a CPUID feature bit. In this instance I am hiding the two bits identified above through the use of the “0” in those bit positions. There are other options you can use as displayed in the legend.

I chose “0” rather than “R” as I need to hide the feature from the guest OS and I do not care if the destination host actually has that feature or not.

I saved this configuration and powered on the VM. I was able to successfully perform a vMotion from host A to host B. I was also able to vMotion the VM back to host A.

I performed a vMotion back to host B and powered the VM off. I then powered the VM back on on host B. I tried to vMotion back to host A, which again failed with the same error as shown above. The reason it failed in the reverse direction is that the VM picks up its masked capabilities at power-on and maintains that set of capabilities until it is powered off once more. So by powering on the VM on host B, it got a different set of capabilities to when it was powered on on host A. This explains why the original vMotion attempts gave two different sets of errors.

To get the masks to enable a vMotion from host B to host A, I took a look at the developer’s manual and, with some Google-fu, identified the CPUID bits needed to mask the unsupported features:

PCID: CPUID EAX=1, result in ECX bit 17
XSAVE: CPUID EAX=1, result in ECX bit 26
AVX: CPUID EAX=1, result in ECX bit 28
F16C: CPUID EAX=1, result in ECX bit 29

FSGSBASE: CPUID EAX=7, result in EBX bit 00

XSAVE SSE: CPUID EAX=0xd, result in EAX bit 01
XSAVE YMM: CPUID EAX=0xd, result in EAX bit 02 (YMM are 256-bit AVX registers)
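The mask strings used in the VM settings are 32 characters long with bit 31 on the left and bit 0 on the right, which makes them easy to get wrong by hand. As a small sketch (the helper names are mine, not anything VMware provides), the bit numbers listed above can be turned into mask strings and back like this:

```python
def build_mask(hidden_bits, width=32):
    """Return a CPUID mask string with '0' at each hidden bit, '-' elsewhere.

    The leftmost character is the highest bit (31), the rightmost is bit 0.
    """
    chars = ["-"] * width
    for bit in hidden_bits:
        if not 0 <= bit < width:
            raise ValueError(f"bit {bit} is outside 0..{width - 1}")
        chars[width - 1 - bit] = "0"
    return "".join(chars)


def hidden_bits(mask):
    """Return the sorted bit numbers which a mask string sets to '0'."""
    return sorted(len(mask) - 1 - index for index, char in enumerate(mask) if char == "0")


if __name__ == "__main__":
    # PCID (17), XSAVE (26), AVX (28) and F16C (29) in CPUID.1:ECX
    print('cpuid.1.ecx = "%s"' % build_mask([17, 26, 28, 29]))
    # FSGSBASE (0) in CPUID.7:EBX
    print('cpuid.7.ebx = "%s"' % build_mask([0]))
    # XSAVE SSE (1) and XSAVE YMM (2) state in CPUID.0xd:EAX
    print('cpuid.d.eax = "%s"' % build_mask([1, 2]))
```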

The first four are easy as the vSphere client allows one to edit the EAX=1 CPUID results. With the below configuration in place, the vMotion from host B to host A only showed the last three errors (FSGSBASE, XSAVE SSE and XSAVE YMM). This is expected as no masking had yet been put in place for those three features.

To put the masking in place for EAX=0x7 and EAX=0xd we need to edit the virtual machine’s .VMX file. We can do this by editing the .vmx file directly or by using the Configuration Parameters dialogue for the VM under Options/Advanced/General in the VM’s settings dialogue. The following two parameters (first one for FSGSBASE and second for the XSAVE) were added:

cpuid.7.ebx = -------------------------------0
cpuid.d.eax = -----------------------------00-

Powering on the VM succeeded, however the vMotion to host A failed with the same error about FS & GS Base registers (but the XSAVE errors were gone). Surprisingly, when I checked the .vmx directly, the cpuid.7.ebx line was missing. For some reason it appears that the VI client does not save this entry. So I removed the VM from the inventory, added that line to the .VMX directly and then re-registered the VM.

I was now able to power on the VM on host B and vMotion back and forth. I was not able to do the same when the VM was powered on on host A. I needed to merge the two sets of capabilities.

At this stage we would have the following in the .vmx file:

for host A -> host B:
cpuid.80000001.ecx = "-----------------------0--------"
cpuid.1.ecx = "---------0----------------------"

for host B -> host A:
cpuid.1.ecx = "--00-0--------0-----------------"
cpuid.7.ebx = "-------------------------------0"
cpuid.d.eax = "-----------------------------00-"

(Note that there are some default entries which get added which are all dashes, and one for cpuid.80000001.edx with dashes and a single H).

We merge our two sets of lines to obtain:

cpuid.80000001.ecx = "-----------------------0--------"
cpuid.1.ecx = "--00-0---0----0-----------------"
cpuid.7.ebx = "-------------------------------0"
cpuid.d.eax = "-----------------------------00-"

At this stage we can now power on the VM on either host and migrate in either direction. Success. Using these four lines of config, we have masked the specific features which vSphere was highlighting as preventing vMotion. It has also shown how we can hide or expose specific CPUID features on a VM by VM basis.
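Merging two masks for the same CPUID register is just a per-character combination: keep a “0” wherever either mask has one and leave the rest as dashes. A minimal sketch of that merge (the function name is hypothetical, and it deliberately refuses to guess when two masks disagree on a non-dash character):

```python
def merge_masks(first, second):
    """Combine two CPUID mask strings for the same register.

    A '-' (no change) gives way to whatever the other mask specifies.
    Two different non-dash characters in the same position are a conflict.
    """
    if len(first) != len(second):
        raise ValueError("masks must be the same length")
    merged = []
    for position, (a, b) in enumerate(zip(first, second)):
        if a == b or b == "-":
            merged.append(a)
        elif a == "-":
            merged.append(b)
        else:
            raise ValueError(f"conflict at character {position}: {a!r} vs {b!r}")
    return "".join(merged)


if __name__ == "__main__":
    host_a_to_b = "---------0----------------------"  # hides MOVBE
    host_b_to_a = "--00-0--------0-----------------"  # hides F16C, AVX, XSAVE, PCID
    print('cpuid.1.ecx = "%s"' % merge_masks(host_a_to_b, host_b_to_a))
```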

Manual EVC Baseline Configuration

Back to EVC. The default EVC masks can be determined by creating a cluster (even without any hosts) and enabling EVC. You can then see the default masks put in place on the host by EVC. Yes, EVC puts a default mask in place on the hosts in an EVC enabled cluster. The masked off CPU features are then not exposed to the guests at power-on and are not available during vMotion compatibility checks.

The default baselines for Westmere and Sandybridge EVC modes are shown below:

 

The differences are highlighted. Leaf1 (i.e. CPUID with EAX=1) EAX result relates to processor family and stepping information. The three Leaf1 ECX flags relate to AES, XSAVE and TSC-Deadline respectively. The three Leafd EAX flags are for x87, SSE and AVX XSAVE state. The three Leafd ECX flags are related to maximum size needed for the XSAVE area.

Anyway I’ve digressed. So the masks which I created above obviously only dealt with the specific differences between my two processors in question. In order to determine a generic “Westmere” compatible mask on a per VM basis we will start with VMware’s ESXi EVC masks above. The EVC masks are showing which feature bits are hidden (zeros) and which features may be passed through to guests (ones). So we can see which feature bits are hidden in a particular EVC mode. So to convert the above EVC baselines to VM CPUID masks I keep the zeros and change the ones to dashes. I selected dashes instead of ones to ensure that the default guest OS masks and host flags still take effect. We get the following for a VM for Westmere feature flags:

cpuid.80000001.ecx = "0000000000000000000000000000000-"
cpuid.80000001.edx = "00-0-000000-00000000-00000000000"
cpuid.1.ecx = "000000-0-00--000---000-000------"
cpuid.1.edx = "-000-------0-0-------0----------"
cpuid.d.eax = "00000000000000000000000000000000"
cpuid.d.ecx = "00000000000000000000000000000000"
cpuid.d.edx = "00000000000000000000000000000000"

I did not map the cpuid.1.eax flags as I did not want to mess with CPU family/stepping flags. Also, the EVC masks listed did not show the cpuid.7.ebx line I needed for the FSGSBASE feature. Sure enough, using only the 7 lines above meant I could not vMotion from host B to host A. So, adding

cpuid.7.ebx = "-------------------------------0"

to the VMX then allowed the full vMotion I was looking for. The ESXi hypervisor must alter other flags apart from only those shown on the EVC configuration page.
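The conversion from an EVC baseline to a per-VM mask described above (keep the zeros, turn the ones into dashes) can be sketched as a one-line substitution. The example baseline in the code is reconstructed by reversing the cpuid.80000001.ecx mask listed above, so treat it as an assumption rather than a copy of what the EVC page displays:

```python
def evc_baseline_to_vm_mask(baseline):
    """Convert an EVC baseline string of 0s and 1s into a VM CPUID mask.

    Zeros (hidden features) are kept; ones become '-' so that the default
    guest OS masks and host flags still apply to those bits.
    """
    unexpected = set(baseline) - {"0", "1"}
    if unexpected:
        raise ValueError(f"unexpected characters in baseline: {sorted(unexpected)}")
    return baseline.replace("1", "-")


if __name__ == "__main__":
    # Reconstructed from the mask above: only bit 0 is allowed through.
    baseline_80000001_ecx = "0" * 31 + "1"
    print('cpuid.80000001.ecx = "%s"' % evc_baseline_to_vm_mask(baseline_80000001_ecx))
```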

TL;DR

To configure a poor man’s EVC on a VM by VM basis for a Westmere feature set, add the following lines to a VM’s .VMX file.

cpuid.80000001.ecx = "0000000000000000000000000000000-"
cpuid.80000001.edx = "00-0-000000-00000000-00000000000"
cpuid.1.ecx = "000000-0-00--000---000-000------"
cpuid.1.edx = "-000-------0-0-------0----------"
cpuid.7.ebx = "-------------------------------0"
cpuid.d.eax = "00000000000000000000000000000000"
cpuid.d.ecx = "00000000000000000000000000000000"
cpuid.d.edx = "00000000000000000000000000000000"

 

Appendix

1

Useful thread -> https://communities.vmware.com/thread/467303

The above thread covers manipulating the guest CPUID. An interesting option is mentioned in post 9 relating to an option to enable further CPUID manipulation than is possible by default. In my tinkering above, I did not need this option.

monitor_control.enable_fullcpuid = TRUE

Note too that the vmware.log of any VM can be used to see the CPUID information of the host, as also mentioned in post 9:

As for extracting the results, you can write a program to query the CPUID function(s) of interest, or you can just look in the vmware.log file of any VM.  All current VMware hypervisors log all CPUID information for both the host and the guest

Post 13, again by user jmattson (ex-VMware, now at Google), reveals a simple way to configure the processor name visible to guests:

cpuid.brandstring = "whatever you want"

2

This thread https://communities.vmware.com/thread/503236, again involving jmattson, discusses cpuid masks – and gives an insight into how EVC masks interact with the VM cpuid masks.

3

This post https://v-reality.info/2014/08/vsphere-vm-version-impact-available-cpu-instructions/ reveals that the virtual hardware version of a given VM also plays a role in the CPUID mask a VM is given. I found this interesting as it does give us another reason to actively upgrade the hardware versions of VMs.

4

A little gem is mentioned at the bottom of https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1029785 – it seems that IBM changed the BIOS default for some of their servers. The option was changed to enable the AES feature by default. This resulted in identical servers configured with BIOS defaults, which were added to a vSphere cluster, having two different sets of CPUID feature bits set (AES enabled on some and disabled on others) resulting in vMotions not being possible between all servers.

I was doing a clear out of some paperwork and came across these write performance tests which I did a few years back on an HP DL380 G5 with a P400 RAID controller.

  • 4 drive RAID10 = 144MB/s (~72MB/s per drive)
  • 4 drive RAID0 = 293MB/s (~73MB/s per drive)
  • 8 drive RAID0 = 518MB/s (~64MB/s per drive)
  • 8 drive RAID5 = 266MB/s (~38MB/s per drive)
  • 8 drive RAID6 = 165MB/s (~28MB/s per drive)
  • 8 drive RAID10 = 289MB/s (~72MB/s per drive)

The “per drive” is data being written per active data drive, excluding any RAID overheads.
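The per-drive figures follow from dividing the total throughput by the number of drives holding data: all of them for RAID0, half for RAID10, one fewer for RAID5 and two fewer for RAID6. A quick sketch that recomputes them from the totals above (the function names are mine):

```python
def data_drives(level, drives):
    """Number of drives' worth of user data for a given RAID level."""
    if level == "RAID0":
        return drives
    if level == "RAID10":
        return drives // 2   # the other half holds the mirror copies
    if level == "RAID5":
        return drives - 1    # one drive's worth of parity
    if level == "RAID6":
        return drives - 2    # two drives' worth of parity
    raise ValueError(f"unsupported RAID level: {level}")


def per_drive(level, drives, total_mb_s):
    """Write throughput per active data drive in MB/s."""
    return total_mb_s / data_drives(level, drives)


# (level, drives, measured MB/s) from the list above
RESULTS = [
    ("RAID10", 4, 144),
    ("RAID0", 4, 293),
    ("RAID0", 8, 518),
    ("RAID5", 8, 266),
    ("RAID6", 8, 165),
    ("RAID10", 8, 289),
]

if __name__ == "__main__":
    for level, drives, total in RESULTS:
        print(f"{drives} drive {level}: {per_drive(level, drives, total):.2f} MB/s per drive")
```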

I inferred from the above that the P400 RAID controller maxes out at around 518MB/s as it is unable to saturate 8 drives in a RAID0 array (64MB/s vs 72MB/s per drive). Not sure if this is a controller throughput or PCIe bus limitation.

Either way, this testing was (I think) done on Ubuntu 10.x or 12.x with pretty bog standard settings using a simple command such as:

dd if=/dev/zero of=/dev/cciss/c0d0 bs=1024k

I thought I’d capture these figures here since I have nowhere else to save them.

Just a reminder that there are reserved DNS domain names and reserved IP addresses for documentation purposes. These are detailed in RFC2606 and RFC5737 respectively.

The reserved DNS top level domain (TLD) names are:

  • .test
  • .example
  • .invalid
  • .localhost

.test is reserved and recommended for testing purposes within organisations. .example is reserved for use within documentation which requires example DNS names. .invalid should not be configured within resolvers and can be used when invalid DNS names are required for testing purposes. .localhost has traditionally been defined with an A record pointing at the localhost IP address of 127.0.0.1.

There are reserved second level domain names for documentation purposes of

  • example.com
  • example.net
  • example.org

The following three address ranges have been reserved for use within documentation where example IP addresses are required:

  • 192.0.2.0/24 (TEST-NET-1)
  • 198.51.100.0/24 (TEST-NET-2)
  • 203.0.113.0/24 (TEST-NET-3)
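When reviewing documentation it can be handy to check that the example names and addresses really do fall inside these reserved ranges. A minimal sketch using Python’s standard ipaddress module (the function names and the sample values are my own):

```python
import ipaddress

# The three documentation ranges from RFC5737.
DOC_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),     # TEST-NET-1
    ipaddress.ip_network("198.51.100.0/24"),  # TEST-NET-2
    ipaddress.ip_network("203.0.113.0/24"),   # TEST-NET-3
]

# Reserved names from RFC2606.
RESERVED_TLDS = {"test", "example", "invalid", "localhost"}
RESERVED_SLDS = {"example.com", "example.net", "example.org"}


def is_documentation_ip(address):
    """True if the address is inside one of the TEST-NET ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in DOC_NETWORKS)


def is_reserved_name(name):
    """True if the DNS name ends in a reserved TLD or second level domain."""
    labels = name.lower().rstrip(".").split(".")
    return labels[-1] in RESERVED_TLDS or ".".join(labels[-2:]) in RESERVED_SLDS


if __name__ == "__main__":
    for address in ("192.0.2.10", "10.1.1.2"):
        print(address, is_documentation_ip(address))
    for name in ("www.example.com", "host.test", "example.co.uk"):
        print(name, is_reserved_name(name))
```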

 

There are various posts relating to issues with VMware Workstation and the use of SATA physical drives (i.e. passing a physical SATA drive through to the guest VM).

The first challenge is getting past the “Internal Error” error message. To do so, create a VM with a SATA virtual disk. Once you’ve done this, you can try to add a SATA physical drive to the guest. It has to be added as a SATA device for this issue to apply, since adding the pass-through drive as a SCSI device works. You will receive the “Internal Error” error message. Note that the .vmdk file for the drive is still created in the VM’s directory.

The next step is to edit the .vmx and replace the original SATA device (sata0:0.fileName= line) with the newly created .vmdk file. This will get the SATA pass-through device into the VM. However, I was not able to power on the VM at this stage and got another error message.

Looking in the VM’s log file it was apparent that VMware Workstation was unable to open the raw device,
\\?\Volume{someGUID}

The fix to this is to run VMware Workstation as administrator. So instead of double clicking as you normally would, you need to right click and select “Run as administrator”. This was the step that I did not see mentioned anywhere else.

By doing this, I was able to start the VM and it then worked as expected!

This is just a short post about an annoying issue I encountered today while updating my automated Ubuntu installer with Ubuntu 14.04 (Trusty Tahr). I have a PXE based network boot process to automatically install and configure Ubuntu server instances. The server to be installed will PXE boot and then get the installation files and preseed configuration file via HTTP. This worked well for prior LTS releases.  I don’t bother with automating non-LTS releases as their lifespan is far too short for “production” use.

I updated the install server with the new 14.04 server images (AMD64 and i386). I then updated my standard preseed configuration file with the new paths to 14.04 and set off a server installation. Unfortunately, not long into the installation process an error message was displayed. “Install the system” was the title of the message and the error was “Installation step failed”: “An installation step failed. You can try to run the failing item again from the menu, or skip it and choose something else. The failing step is: Install the system”.

Error during a netboot install of Ubuntu 14.04

Not terribly useful, if I say so myself. Looking at the installation log on VTY-4 (accessed via ALT-F4 on the console), I saw messages about “main menu: INFO: Menu item ‘live-installer’ selected” followed by “base-installer: error: Could not find any live images”. Again, not very useful.

To cut a long story short, after much time using Google, I found the solution. The way base Ubuntu is installed seems to have changed with Ubuntu 12.10 (Quantal Quetzal). It seems that rather than installing individual packages initially, a base preconfigured file-system is deployed. This is now contained in a file called “filesystem.squashfs” which is located at “/install/filesystem.squashfs” on the installation media. It seems that when installing via the network (in some situations), you need to configure the preseed file to use this “default” filesystem from the network. This is done in your preseed file by adding the “d-i live-installer/net-image” option, such as in the following line:

d-i live-installer/net-image string http://10.1.1.2/trusty-server-amd64/install/filesystem.squashfs

where 10.1.1.2 is your network installation server and /trusty-server-amd64 is the location of the installation media on the network installation webserver.

Once that is in place, you’re good to go! As I said before, this is only necessary since Ubuntu 12.10. As a result, all of those upgrading their installations from 12.04 LTS to 14.04 LTS may need to be aware of this. There is surprisingly little reference to this on the Internet. Do many people not install over the network in isolated install networks?

 

It is not often that I come across a tool which does something I’ve been trying to do for a long time. In this instance the tool is for PDF manipulation. This blog entry is primarily a reminder to myself of how to do this, but hopefully it will help someone else. Scanning documents and converting them to PDF files is fairly simple these days thanks to freebies such as Foxit Reader. However, for anything more complicated (page insertion, deletion, rotation, etc) I’ve always run into problems finding free tools. I do own an old copy of Adobe Acrobat, which can do many of these functions but it is on an old computer and I don’t seem to be able to move it thanks to Adobe’s licensing regime. In the past I’ve managed to find a variety of tools to do these one off PDF manipulations, say a merge of two documents, but no real “workhorse” tool or programme which is worth keeping around. However – today I found a real gem!

Some background. I’ve started scanning a variety of documents rather than keeping paper copies. Single pages are typically no problem, as described above. However, today I had the need to scan an A5 booklet formed of A4 sheets. Each printed portrait A5 page was half of a landscape A4 sheet. The page numbers of the booklet were as follows:

A4 sheet1:  A5 pages: 12 / 1 (reverse side: 2 / 11)
A4 sheet2:  A5 pages: 10 / 3 (reverse side: 4 / 9)
A4 sheet3:  A5 pages: 8 / 5 (reverse side: 6 / 7)

In the past, I tried to solve this type of task by cutting the A4 sheets in half and then scanning the resultant A5 sheets. This worked in a fashion but the scanner had reliability issues when feeding the A5 sheets. I figured there must be a better way. Turns out there is, and with a bit of Googling I found this.

Three free tools can accomplish what I want. These tools obviously need to be installed/available on your computer.

  • Foxit Reader – for the scanning of the document (any “scan to PDF” tool will be fine for this)
  • Briss – for cropping the scanned A4 PDF pages into A5 PDF pages
  • PDFtk – for rotating and reordering the PDF file’s pages – PDFtk Server (the command line tool) is described herein

PDFtk is the real gem which I found today. Super powerful and free!

The process to scan and process such a booklet and get a usable resulting PDF is as follows.

Remove any staples from the booklet and check the pages remain in order

This is fairly straight forward. Check the pages remain in order and that they will pass through the scanner without issues. Check for any wrinkles or bent corners. The key is to ensure that the pages scan as smoothly and repeatedly as possible.

Scan the double-sided A4 sheets

Using a typical tool, such as Foxit Reader, scan the double-sided A4 sheets, into a PDF titled ff1.pdf. I ended up with alternating upside down, rightside up sheets. So the PDF pages were as:

PDF page 1: pages 12/1 upside down
PDF page 2: pages 2/11 rightside up
PDF page 3: pages 10/3 upside down
PDF page 4: pages 4/9 rightside up
PDF page 5: pages 8/5 upside down
PDF page 6: pages 6/7 rightside up

Rotate the upside down pages

We need to get all the pages in the PDF file to be the correct orientation. This is a breeze with PDFtk. In my case, I used the following command line:

pdftk ff1.pdf cat 1south 2 3south 4 5south 6 output ff2.pdf

Reading the documentation further, I discovered that one could instead use:

pdftk ff1.pdf rotate oddsouth output ff2.pdf

This command rotates pages 1, 3 and 5  by 180 degrees and outputs the resulting PDF to ff2.pdf. We now have a PDF with the scanned pages correctly orientated but each PDF page consists of two A5 sheets:

PDF page 1 : pages 12/1
PDF page 2: pages 2/11
PDF page 3: pages 10/3
PDF page 4: pages 4/9
etc

Crop each A4 PDF page into a pair of A5 PDF pages

This is where the Briss tool works its magic. Briss enables PDF files to be cropped. In this case we want two crop regions on each PDF page. So we load ff2.pdf into Briss. We then define two crop areas on each page. Note that Briss overlays all even and all odd numbered pages so that only two crop definitions are required for multi-page PDFs. A single crop area is displayed by default for both the even and odd stacked pages. A second crop area can be created by clicking and dragging on the stacked pages. Carefully define two similar sized crop areas over the pair of A5 pages on each displayed A4 stack. Once the areas are defined generate the new PDF, ff3.pdf

Reorder the PDF pages

The resultant PDF, ff3.pdf, should now have all the pages as A5 looking pages but they will be out of order. In my case the PDF pages contained the following booklet page order: 12, 1, 2, 11, 10, 3, 4, 9, 8, 5, 6, 7

We turn again to PDFtk and run the following command:

pdftk ff3.pdf cat 2 3 6 7 10 11 12 9 8 5 4 1 output ff4.pdf

This creates a PDF file, ff4.pdf, with reordered pages. The first page in the new PDF was the second from the input PDF and so on.
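Working out the page list by hand is fine for 12 pages but gets error-prone for bigger booklets. As a small sketch (the function name is my own), the list given to “cat” can be derived from the booklet page order found in the cropped PDF: for each booklet page in ascending order, look up which PDF page it currently sits on.

```python
def cat_order(scanned_pages):
    """Return the PDF page numbers to pass to 'pdftk ... cat', in order.

    scanned_pages[i] is the booklet page number found on PDF page i + 1.
    """
    position = {page: index for index, page in enumerate(scanned_pages, start=1)}
    return [position[page] for page in sorted(scanned_pages)]


if __name__ == "__main__":
    # Booklet page order in ff3.pdf after cropping, as listed above.
    scanned = [12, 1, 2, 11, 10, 3, 4, 9, 8, 5, 6, 7]
    order = " ".join(str(number) for number in cat_order(scanned))
    print(f"pdftk ff3.pdf cat {order} output ff4.pdf")
```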

Enjoy the completed PDF

We now have a completed PDF containing the individual pages from the booklet, all correctly ordered and rotated. A little bit of work, sure. But much easier than manually trying to scan each individual page.

 

A further comment. I think the PDFtk Server tool is fabulous. It is a command line tool and I can see myself returning to it time and again. It is seriously powerful with a vast array of options. I am sorry and amazed that I’ve not come across it before. There is a free GUI version available which isn’t as powerful and a paid-for GUI with a similar feature set.