| Author |
Message |
![[Post New]](/templates/default/images/icon_minipost_new.gif) 21 Oct 2009 08:48:03
|
Venkat_B
Member
Joined: 16 Jun 2008 08:39:02
Messages: 45
Offline
|
Hi,
What would be the reason for such errors:
NOTICE: -20000could not read block 11079 of relation 1228638/406241/1743636: read only 0 of 8192 bytes
Is this because of any corrupted data inserted or bad block with hardware.
Is there any way to resolve this kind of error. This is the problem getting while trying to retrieve data.
Thanks,
Venkat.
|
|
|
 |
![[Post New]](/templates/default/images/icon_minipost_new.gif) 21 Oct 2009 08:50:55
|
scottiebo
Senior member
![[Avatar]](/images/avatar/2d0122e6c17cdb89f8eed4d75b5f5eef.png)
Joined: 23 Oct 2005 21:05:02
Messages: 197
Offline
|
Venkat_B wrote:Hi,
What would be the reason for such errors:
NOTICE: -20000could not read block 11079 of relation 1228638/406241/1743636: read only 0 of 8192 bytes
Is this because of any corrupted data inserted or bad block with hardware.
This is most usually seen when you have some kind of problem at the hardware level.
Is there any way to resolve this kind of error. This is the problem getting while trying to retrieve data.
Thanks,
Venkat.
Well, what operation were you doing when this happened? A select statement?
|
--Scottie |
|
|
 |
![[Post New]](/templates/default/images/icon_minipost_new.gif) 22 Oct 2009 01:40:27
|
sam_jass
Member
Joined: 22 Oct 2009 01:22:55
Messages: 41
Offline
|
Hi Venkat,
This is most probably because of hardware issue. However, sometimes it happened because of overflow of data on the same
disk or the disk is in read only mode. tRY to check it out with your SYS. admin.
sam jas
|
|
|
 |
![[Post New]](/templates/default/images/icon_minipost_new.gif) 22 Oct 2009 02:56:02
|
Venkat_B
Member
Joined: 16 Jun 2008 08:39:02
Messages: 45
Offline
|
Hi Scottiebo and Sam,
Thanks for your reply. I am running just count(*) when i got this error:
(select count(*) from test_table where columnid between 10000 and 20000;)
Is dropping the entire table and reloading is the only solution or do we have any other?
Because if i have million of rows with table and i have this error ... then even reloading data will be hard.
If it is like problem with hardware(disk in read-only mode) then it not suppose to retrieve data from other tables also.But we are able to retrieve data from other tables which are stored in same disk.
Thanks,
Venkat
This message was edited 1 time. Last update was at 22 Oct 2009 03:00:38
|
|
|
 |
![[Post New]](/templates/default/images/icon_minipost_new.gif) 22 Oct 2009 09:33:15
|
sam_jass
Member
Joined: 22 Oct 2009 01:22:55
Messages: 41
Offline
|
I am running just count(*) when i got this error:
(select count(*) from test_table where columnid between 10000 and 20000;)
Is dropping the entire table and reloading is the only solution or do we have any other?
Because if i have million of rows with table and i have this error ... then even reloading data will be hard.
>> Have you configured HA solution in your environment ? Do you have a valid backup ?
If it is like problem with hardware(disk in read-only mode) then it not suppose to retrieve data from other tables also.But we are able to retrieve data from other tables which are stored in same disk.
>> then it is not the issue with whole disk. There may be some blocks on the disk is corrupted.
|
|
|
 |
![[Post New]](/templates/default/images/icon_minipost_new.gif) 22 Oct 2009 10:56:39
|
scottiebo
Senior member
![[Avatar]](/images/avatar/2d0122e6c17cdb89f8eed4d75b5f5eef.png)
Joined: 23 Oct 2005 21:05:02
Messages: 197
Offline
|
sam_jass wrote:I am running just count(*) when i got this error:
(select count(*) from test_table where columnid between 10000 and 20000;)
Is dropping the entire table and reloading is the only solution or do we have any other?
Because if i have million of rows with table and i have this error ... then even reloading data will be hard.
I would reload from a backup into a new table: create table test_table_load...
Then, as far as finding data that may be recoverable, you're going to have to scan for different ranges of data looking for where your bad blocks are.
If it is like problem with hardware(disk in read-only mode) then it not suppose to retrieve data from other tables also.But we are able to retrieve data from other tables which are stored in same disk.
>> then it is not the issue with whole disk. There may be some blocks on the disk is corrupted.
+1 It's very a-typical that you lose a whole disk, usually just certain blocks at a time.
|
--Scottie |
|
|
 |
![[Post New]](/templates/default/images/icon_minipost_new.gif) 23 Oct 2009 05:52:17
|
Venkat_B
Member
Joined: 16 Jun 2008 08:39:02
Messages: 45
Offline
|
Have you configured HA solution in your environment ?
Can you please explain or give me some knowledge about this HA solution.... What exactly is this as i have no idea about HA solution.
what my doubt is now we are getting error while querying between some range. What is the solution if i have the same with just select count(*) from test_table;
How can i resolve if we found the particular range/records which is causing problem by checking with different ranges.
Is the only solution to load table from backup / direct reloading or do we have any other possibilities where we can resolve the issue like moving the bad blocks data to other blocks....etc.,
---Venkat
|
|
|
 |
![[Post New]](/templates/default/images/icon_minipost_new.gif) 23 Oct 2009 07:25:00
|
scottiebo
Senior member
![[Avatar]](/images/avatar/2d0122e6c17cdb89f8eed4d75b5f5eef.png)
Joined: 23 Oct 2005 21:05:02
Messages: 197
Offline
|
Venkat_B wrote:
Have you configured HA solution in your environment ?
Can you please explain or give me some knowledge about this HA solution.... What exactly is this as i have no idea about HA solution.
[/qote]
If you had configured a PITR (point in time recovery mechanism) this would be much easier to do as we could "roll back" to the point in time just before the corruption.
what my doubt is now we are getting error while querying between some range. What is the solution if i have the same with just select count(*) from test_table;
You may only have a bad block in the range that you specified before. However it is likely that more relations are indeed damaged.
How can i resolve if we found the particular range/records which is causing problem by checking with different ranges.
Write a script to go through the table in ranges until you encounter an error.
i.e.
No error? Good:
etc...
Is the only solution to load table from backup / direct reloading or do we have any other possibilities where we can resolve the issue like moving the bad blocks data to other blocks....etc.,
---Venkat
Reloading is the best way.
If you want emergency recovery services (something EnterpriseDB has done in the past) I recommend emailing 'sales@enterprisedb.com'
Good luck
|
--Scottie |
|
|
 |
![[Post New]](/templates/default/images/icon_minipost_new.gif) 23 Oct 2009 07:32:03
|
sam_jass
Member
Joined: 22 Oct 2009 01:22:55
Messages: 41
Offline
|
Hi Venkat,
Can you please explain or give me some knowledge about this HA solution.... What exactly is this as i have no idea about HA solution.
>> HIGH AVAILABILITY solution, like SAN storage, replication, standby server.
what my doubt is now we are getting error while querying between some range. What is the solution if i have the same with just select count(*) from test_table;
How can i resolve if we found the particular range/records which is causing problem by checking with different ranges.
Is the only solution to load table from backup / direct reloading or do we have any other possibilities where we can resolve the issue like moving the bad blocks data to other blocks....etc.,
>> The best way is to recover from valid backup. If it is not with you & you can bare with minimum loss of data then you can choose
the option ZERO_DAMAGED_PAGES=on. Note that before set this parameter go through the following link.
http://www.postgresql.org/docs/8.4/static/runtime-config-developer.html
>> If you are working on unix then run the fsck and in case of windows run the chkdisk will mark the bad block and move data to
another block.
>> Also you have a option to go for pg_fsck. You can download it from the pg_foundry.
>> One more option, develop one function which will check the ctid if it found the correct ctid then it keep that data in one valid
table and if it is not exist then it go for exception. At the end of execution of function you have a valid ctid record. Rest of the
record insert manually or insert from the backup if you have.
Thanks
Sam Jas
|
|
|
 |
![[Post New]](/templates/default/images/icon_minipost_new.gif) 23 Oct 2009 08:45:18
|
Venkat_B
Member
Joined: 16 Jun 2008 08:39:02
Messages: 45
Offline
|
Thanks for that.
I think pg_fsck is not supported for Pg Plus 8.3 version. We check for the better possibilities which can minimize the administrator risks/tasks.
--Venkat
|
|
|
 |
|
|