
=== LVM repairs ===

This section was written by Frederic BOITEUX. It was originally titled "HOW TO LOCATE AND REPAIR BAD BLOCKS ON AN LVM VOLUME".

`smartd` reports an error in a short self-test:

{{{
# smartctl -a /dev/hdb
...
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed: read failure       90%        66         37383668
}}}

So the disk has a bad block at LBA `37383668`.

In which physical partition is the bad block?

{{{
# sfdisk -luS /dev/hdb    # or 'fdisk -ul /dev/hdb'

Disk /dev/hdb: 9729 cylinders, 255 heads, 63 sectors/track
Units = sectors of 512 bytes, counting from 0

   Device Boot    Start       End   #sectors  Id  System
/dev/hdb1            63    996029     995967  82  Linux swap / Solaris
/dev/hdb2   *    996030   1188809     192780  83  Linux
/dev/hdb3       1188810 156296384  155107575  8e  Linux LVM
/dev/hdb4             0         -          0   0  Empty
}}}

The bad block is in `/dev/hdb3`, an LVM2 partition, because 37383668 lies between its start sector (1188810) and its end sector (156296384). Counted in 512-byte sectors from the beginning of that partition, the bad block is at offset
{{{
(37383668 - 1188810) = 36194858
}}}
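
The range check and the subtraction can be sketched in shell as follows. This is only an illustration: the variable names are invented here, and the sector numbers are copied by hand from the `smartctl` and `sfdisk` output of this example.

```shell
# Values copied by hand from the example output (assumptions of this sketch).
bad_lba=37383668      # LBA_of_first_error reported by smartctl
part_start=1188810    # first sector of /dev/hdb3
part_end=156296384    # last sector of /dev/hdb3

# The bad block belongs to the partition only if it lies inside its range.
if [ "$bad_lba" -ge "$part_start" ] && [ "$bad_lba" -le "$part_end" ]; then
    part_offset=$((bad_lba - part_start))
    echo "bad block is $part_offset sectors into the partition"
else
    echo "LBA $bad_lba is outside this partition" >&2
fi
```

With the numbers of this example, `part_offset` ends up as 36194858.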

Next, we have to find which LVM2 logical volume the block belongs to.

In which logical volume is the bad block?

''IMPORTANT'': LVM2 can map its physical volumes to logical volumes using different schemes: linear, striped, contiguous or not... The following example assumes that the allocation is linear!

The physical partition used by LVM2 is divided into PEs (Physical Extents) of equal size; the first PE starts `pe_start` 512-byte blocks from the beginning of the physical partition.

The `pvdisplay` command gives the PE size (in KB) of the LVM partition:

{{{
# part=/dev/hdb3 ; pvdisplay -c "$part" | awk -F: '{print $8}'
4096
}}}

To express this size in LBA blocks (512 bytes, or 0.5 KB), we multiply this number by 2: `4096 * 2 = 8192 blocks` for each PE.

Finding the offset of the first PE from the beginning of the physical partition is a bit more difficult. If you have a recent LVM2 version, try:

{{{
# pvs -o+pe_start "$part"
}}}

Otherwise, you can look in `/etc/lvm/backup`:
{{{
# grep pe_start $(grep -l "$part" /etc/lvm/backup/*)
pe_start = 384
}}}

Then we work out which PE contains the bad block by computing the PE index (rank) of the faulty block. Because the first PE only starts `pe_start` blocks into the partition, that offset is subtracted first: `(bad block offset in the physical partition - pe_start) / sizeof(PE)` =
{{{
(36194858 - 384) / 8192 = 4418.2708
}}}
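
The same computation can be sketched with shell integer arithmetic, which drops the fractional part directly. The variable names are invented for this illustration and the values are the ones of this example. The sketch subtracts `pe_start` before dividing, since the first PE only begins `pe_start` sectors into the partition; with these numbers, this does not change the resulting PE index.

```shell
# Values from this example (assumptions of this sketch).
part_offset=36194858   # bad block offset inside /dev/hdb3, in 512-byte sectors
pe_kb=4096             # PE size reported by pvdisplay, in KB
pe_start=384           # offset of the first PE, in 512-byte sectors

pe_sectors=$((pe_kb * 2))                                   # 1 KB = 2 sectors
pe_index=$(( (part_offset - pe_start) / pe_sectors ))       # integer division
pe_remainder=$(( (part_offset - pe_start) % pe_sectors ))   # position inside the PE
echo "PE #$pe_index, $pe_remainder sectors into that extent"
```

Here the result is PE #4418, with the bad sector 2218 sectors into that extent.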

So we have to find which LVM2 logical volume uses PE number 4418 (counting starts from 0):
{{{
# lvdisplay --maps | grep -E 'Physical|LV Name|Type'
  LV Name                /dev/WDC80Go/racine
    Type                linear
    Physical volume     /dev/hdb3
    Physical extents    0 to 127
  LV Name                /dev/WDC80Go/usr
    Type                linear
    Physical volume     /dev/hdb3
    Physical extents    128 to 1407
  LV Name                /dev/WDC80Go/var
    Type                linear
    Physical volume     /dev/hdb3
    Physical extents    1408 to 1663
  LV Name                /dev/WDC80Go/tmp
    Type                linear
    Physical volume     /dev/hdb3
    Physical extents    1664 to 1791
  LV Name                /dev/WDC80Go/home
    Type                linear
    Physical volume     /dev/hdb3
    Physical extents    1792 to 3071
  LV Name                /dev/WDC80Go/ext1
    Type                linear
    Physical volume     /dev/hdb3
    Physical extents    3072 to 10751
  LV Name                /dev/WDC80Go/ext2
    Type                linear
    Physical volume     /dev/hdb3
    Physical extents    10752 to 18932
}}}

So PE #4418 is in the `/dev/WDC80Go/ext1` logical volume.
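
With many logical volumes, the lookup can be done mechanically. The sketch below is only an illustration: the extent ranges are typed by hand from the `lvdisplay --maps` output of this example (they are not parsed from the command), and the variable names are invented. It assumes linear allocation, with a single extent range per volume.

```shell
pe_index=4418   # PE that contains the bad block
found_lv=""

# Each line of the here-document: LV name, first PE, last PE (example values).
while read -r lv first last; do
    if [ "$pe_index" -ge "$first" ] && [ "$pe_index" -le "$last" ]; then
        found_lv=$lv
        echo "PE $pe_index belongs to $lv (extents $first to $last)"
    fi
done <<'EOF'
racine 0 127
usr 128 1407
var 1408 1663
tmp 1664 1791
home 1792 3071
ext1 3072 10751
ext2 10752 18932
EOF
```

The loop reads from a here-document rather than from a pipe, so `found_lv` is still set after the loop ends.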

Size of a logical block of the file system on `/dev/WDC80Go/ext1`:

It's an ext3 file system, so the block size can be read like this:
{{{
# dumpe2fs /dev/WDC80Go/ext1 | grep 'Block size'
dumpe2fs 1.37 (21-Mar-2005)
Block size:               4096
}}}

Bad block number for the file system:

The logical volume begins at `PE 3072`, that is, at the following 512-byte block of the physical partition:
{{{
(first PE of the logical volume * sizeof(PE)) + partition offset [pe_start] =
(3072 * 8192) + 384 = 25166208
}}}

So the bad block number for the file system is the integer part of:
{{{
(36194858 - 25166208) / (sizeof(fs block) / 512)
= 11028650 / (4096 / 512) = 1378581.25
}}}
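
As a cross-check, the two steps can be sketched with shell integer arithmetic. The variable names are invented for this illustration and the values are the ones of this example. The remainder shows which 512-byte sector inside the file system block is the faulty one, which is what the fractional part `.25` stands for.

```shell
# Values from this example (assumptions of this sketch).
part_offset=36194858   # bad block offset inside /dev/hdb3, in 512-byte sectors
lv_first_pe=3072       # first PE of /dev/WDC80Go/ext1
pe_sectors=8192        # PE size in 512-byte sectors
pe_start=384           # offset of the first PE, in 512-byte sectors
fs_block_size=4096     # ext3 block size, in bytes

lv_start=$((lv_first_pe * pe_sectors + pe_start))
sectors_per_block=$((fs_block_size / 512))
fs_block=$(( (part_offset - lv_start) / sectors_per_block ))
sector_in_block=$(( (part_offset - lv_start) % sectors_per_block ))
echo "file system block $fs_block, sector $sector_in_block of $sectors_per_block"
```

With these numbers, `lv_start` is 25166208, and the bad sector is sector 2 (counting from 0) of file system block 1378581.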

Test of the file system bad block:
{{{
dd if=/dev/WDC80Go/ext1 of=block1378581 bs=4096 count=1 skip=1378581
}}}

If this `dd` command succeeds, without any error message on the console or in syslog, then the block number calculation is probably wrong! ''Don't'' go any further: re-check the calculation and, if you can't find the error, please give up!

Search and correction follow the same scheme as for simple partitions:
* find the files possibly affected with `debugfs` (`icheck <fs block nb>`, then `ncheck <inode nb returned by icheck>`).
* reallocate the bad block by writing zeros to it, ''using the fs block size'':

{{{
dd if=/dev/zero of=/dev/WDC80Go/ext1 count=1 bs=4096 seek=1378581
}}}

Et voilà!
