dd flag - choosing appropriate block size

D. Hugh Redelmeier hugh-pmF8o41NoarQT0dZR+AlfA at public.gmane.org
Sun Sep 7 19:09:18 UTC 2008


| From: William Muriithi <william.muriithi-Re5JQEeQqe8AvxtiuMwx3w at public.gmane.org>

|  "Note:
| Care has to be taken when specifying the block size bs as exact
| multiple of the physical size of the device because improper block
| size will result in data inconsistency, or overlap."
| 
| http://64.233.169.104/search?q=cache:88BW3Dok8NUJ:publib.boulder.ibm.com/infocenter/systems/topic/com.ibm.aix.cmds/doc/aixcmds2/dd.htm+dd+bs%3D*&hl=en&ct=clnk&cd=12&gl=ca
| 
| Problem now is, you may not always know the physical block size of the
| device and when you have that information, arriving at the best
| multiple is not well explained.  Would it be better to not specify the
| bs when one is not sure?

That is a Google cache page of an AIX manpage.  I don't blame you for
not reading the pathetic Linux (GNU) manpage.

Blocksize only matters for "special files": rawish I/O devices.  Most
other I/O goes through the buffer cache, which essentially eliminates
any blocksize effect other than performance.

Performance: the larger the block size, the fewer system calls that
are made.  bs=1 means there will be a couple of system calls for each
byte.  bs=1024 will cut that down by a factor of 1024.  But you can
take that too far, tying up too much RAM in the dd process.
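
To put numbers on that, here is a rough sketch (mine, not something from
the manpage).  It assumes dd issues one read and one write per block,
which is how I understand it to work.  The helper name blocks_needed and
the 64 KiB total are made up for the illustration.

```shell
#!/bin/sh
# Number of read()/write() pairs dd needs to move $1 bytes with bs=$2
# (rounded up, since a final short block still costs a call pair).
blocks_needed() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

total=65536    # bytes to copy; an arbitrary size chosen for the demo
for bs in 1 1024 65536; do
    count=$(blocks_needed "$total" "$bs")
    # Move the data for real and confirm every run delivers the same bytes.
    moved=$(dd if=/dev/zero bs="$bs" count="$count" 2>/dev/null | wc -c)
    echo "bs=$bs blocks=$count bytes_moved=$((moved))"
done
```

The byte count is the same on every line; only the number of blocks, and
so the number of system calls, changes.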

Disks these days have a raw blocksize of 512 bytes (if I remember
correctly).  Linux filesystems often impose a larger blocksize in some
sense (1k or 4k are typical).  Possible senses: allocation units,
transfer units, "extents".
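
If you would rather look the sizes up than trust my memory, something
like the following should work on a Linux box.  It is a sketch under two
assumptions: that stat is the GNU coreutils version (for -f -c %S), and
that the kernel exposes the queue directory under /sys/block.

```shell
#!/bin/sh
# Filesystem block size for the current directory (GNU stat; %S is the
# "fundamental block size" of the filesystem).
fs_bs=$(stat -f -c %S .)
echo "filesystem block size: $fs_bs"

# Logical and physical sector sizes of each block device, read from sysfs.
# Devices without these files are skipped silently.
for q in /sys/block/*/queue; do
    [ -r "$q/logical_block_size" ] || continue
    [ -r "$q/physical_block_size" ] || continue
    dev=${q#/sys/block/}
    dev=${dev%/queue}
    echo "$dev: logical=$(cat "$q/logical_block_size") physical=$(cat "$q/physical_block_size")"
done
```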

In really raw devices, blocksize matters.  In UNIX, there were /dev
entries for raw disk drives (they were "character" devices) but Linux
doesn't do that.  I think all disk accesses go through the buffer
cache.

Blocksize matters in some kinds of tape drives.  Hardly anyone uses them
anymore.

Overlap is tricky.  But most people never overlap copies.  Why would
you?  I'm assuming I understand what is meant by overlap:
	dd if=/dev/sda1 of=/dev/sda1 skip=1
or
	dd if=/dev/sda1 of=/dev/sda1 seek=1
This doesn't seem very useful.
Accidental case:
	dd if=/dev/sda of=/dev/sda1
Again, not useful.
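
If you want to see what an overlapping copy does without risking a
disk, you can try it on a scratch file.  This is only a sketch, and it
assumes I have guessed the meaning of "overlap" correctly.  The bs=4 and
the 16-byte test pattern are arbitrary; conv=notrunc keeps dd from
truncating the output (which is also the input) and count=3 stops the
seek=1 case from chasing its own output forever.

```shell
#!/bin/sh
f=$(mktemp)

# seek=1: each block is written one block ahead of where it was read,
# so the first block gets smeared over everything after it.
printf 'AAAABBBBCCCCDDDD' > "$f"
dd if="$f" of="$f" bs=4 seek=1 count=3 conv=notrunc 2>/dev/null
seek_result=$(cat "$f")
echo "seek=1: $seek_result"

# skip=1: each block is written one block behind where it was read,
# so the data just shifts down and nothing is clobbered before it is read.
printf 'AAAABBBBCCCCDDDD' > "$f"
dd if="$f" of="$f" bs=4 skip=1 count=3 conv=notrunc 2>/dev/null
skip_result=$(cat "$f")
echo "skip=1: $skip_result"

rm -f "$f"
```

The seek=1 run is the one that loses data, and how it goes wrong depends
on the block size.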

In most cases, bs=10k or bs=100k ought to be harmless and might improve
performance.  I have not tested whether this results in observable
improvements on current hardware.  An exercise for the reader.
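
One way to start on that exercise, as a sketch only: copy an 8 MiB
scratch file with dd's default 512-byte blocks, then with 10k and 100k,
and print the summary line dd writes to stderr.  The file size is
arbitrary, and the format of the summary line depends on which dd you
have.  A file this small will sit in the buffer cache, so this measures
system call overhead rather than the disk.

```shell
#!/bin/sh
src=$(mktemp)
dst=$(mktemp)

# Build an 8 MiB scratch file to copy from.
dd if=/dev/zero of="$src" bs=1024 count=8192 2>/dev/null

for bs in 512 10k 100k; do
    printf 'bs=%-5s ' "$bs"
    # The last line of dd's stderr is its transfer summary.
    dd if="$src" of="$dst" bs="$bs" 2>&1 | tail -n 1
done

# Sanity check: the last copy should be the same size as the source.
copied_bytes=$(( $(wc -c < "$dst") ))
echo "copied $copied_bytes bytes"

rm -f "$src" "$dst"
```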
--
The Toronto Linux Users Group.      Meetings: http://gtalug.org/
TLUG requests: Linux topics, No HTML, wrap text below 80 columns
How to UNSUBSCRIBE: http://gtalug.org/wiki/Mailing_lists




