In the past I ran into a similar problem which was actually caused by a bug in Hadoop. Someone was nice enough to come up with a workaround for this. Perhaps you are running into a similar problem. I also had this problem when calling lots of "load file" commands. After adding this to the hive-site.xml we never had this problem again: hive.fileformat.check false ________________________________ From: Terje Marthinussen [mailto:[email protected]] Sent: Friday, January 07, 2011 4:14 AM To: [email protected] Subject: Re: Too many open files No, the problem is connections to datanodes on port 50010. Terje answered Jan 7 2011 at 07:39 |
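For readers hitting the same thing: the workaround above is a single property in hive-site.xml. A minimal sketch of the entry (the surrounding configuration element and the description text are just for context, not from the original post):

```xml
<configuration>
  <property>
    <name>hive.fileformat.check</name>
    <value>false</value>
    <description>
      Skip the file format check Hive performs on LOAD DATA.
      Disabling it works around the datanode-connection leak
      described in this thread, at the cost of Hive no longer
      validating that the input matches the table's format.
    </description>
  </property>
</configuration>
```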
Make it 10: 10 -Fuad > answered Oct 23 2009 at 18:33 |
Jun, I observed a similar kind of thing recently (I didn't notice it before because our file limit is huge). I have a set of brokers in a datacenter, and producers in different datacenters. At some point I got disconnections; from the producer's perspective I had something like 15 connections to the broker. On the other hand, on the broker side, I observed hundreds of connections from the producer in an ESTABLISHED state. We had some default settings for the socket timeout at the OS level, which we reduced hoping it would prevent the issue in the future. I'm not sure if the issue is from the broker or the OS configuration, though. I'm still keeping the broker under observation for the time being. Note that for clients in the same datacenter we didn't see this issue; the socket count matches on both ends. Nicolas Berthet answered Sep 25 2013 at 22:06 |
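One app-level mitigation for half-open connections like these is to enable TCP keepalive on the client socket, so the broker side eventually notices dead peers instead of holding ESTABLISHED state forever. A minimal sketch in Python; the per-socket TCP_KEEPIDLE/KEEPINTVL/KEEPCNT options are Linux-specific, and the 60/10/5 values are illustrative assumptions, not anything this thread settled on:

```python
import socket

def make_keepalive_socket(idle=60, interval=10, probes=5):
    """Create a TCP socket with keepalive enabled.

    idle/interval/probes mirror the net.ipv4.tcp_keepalive_* sysctls,
    but applied per-socket (the TCP_KEEP* constants exist on Linux;
    other platforms fall back to the OS-wide defaults).
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    if hasattr(socket, "TCP_KEEPIDLE"):
        s.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
        s.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
        s.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)
    return s
```

This only helps if the client is the side you control; the OS-wide sysctls discussed below apply to every socket on the box.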
Hi Mark, I'm using CentOS 6.2. My file limit is something like 500k; the value is arbitrary. The keepalive settings we reduced were: net.ipv4.tcp_keepalive_time, net.ipv4.tcp_keepalive_intvl, net.ipv4.tcp_keepalive_probes. I still notice an abnormal number of ESTABLISHED connections. I've been doing some searching and came across this page (http://www.lognormal.com/blog/2012/09/27/linux-tcpip-tuning/). I'll change "net.netfilter.nf_conntrack_tcp_timeout_established" as indicated there; it looks closer to the solution to my issue. Are you also experiencing the issue in a cross-datacenter context? Best regards, Nicolas Berthet answered Sep 26 2013 at 18:41 |
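For reference, the knobs named above live in /etc/sysctl.conf (or a drop-in under /etc/sysctl.d/) and are applied with `sysctl -p`. A sketch with illustrative values only, not the ones used in this thread:

```ini
# Start probing idle connections sooner than the 7200 s default
net.ipv4.tcp_keepalive_time = 600
net.ipv4.tcp_keepalive_intvl = 60
net.ipv4.tcp_keepalive_probes = 5
# Expire conntrack state for long-idle ESTABLISHED flows
# (the kernel default is 5 days, i.e. 432000 s)
net.netfilter.nf_conntrack_tcp_timeout_established = 86400
```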
Hi Mark, Sorry for the delay. We're not using a load balancer, if that's what you mean by LB. After applying the change I mentioned last time (the netfilter thing), I couldn't see any improvement. We even restarted Kafka, but since the restart I've seen the connection count slowly getting higher. Best regards, Nicolas Berthet answered Oct 4 2013 at 02:15 |
Hi Terje, I have asked about this issue in an earlier thread but never got any response. I get this exception when I am using Hive over Thrift and submitting 1000s of LOAD FILE commands. If you actively monitor the open file count of the user under which I run the hive instance, it keeps on creeping up for every LOAD FILE command sent to it. I have a temporary fix by increasing the # of open file(s) to 60000+ and then periodically restarting my thrift server (once every 2 days) to release the open file handles. I would appreciate some feedback. (trying to find my earlier email) Thanks, Viral On Thu, Jan 6, 2011 at 4:57 PM, Terje Marthinussen wrote: > Hi, > > While loading some 10k+ .gz files through HiveServer with LOAD FILE etc. > etc. > > 11/01/06 22:12:42 INFO exec.CopyTask: Copying data from file:XXX.gz to > hdfs://YYY > 11/01/06 22:12:42 INFO hdfs.DFSClient: Exception in createBlockOutputStream > java.net.SocketException: Too many open files > 11/01/06 22:12:42 INFO hdfs.DFSClient: Abandoning block > blk_8251287732961496983_1741138 > 11/01/06 22:12:48 INFO hdfs.DFSClient: Exception in createBlockOutputStream > java.net.SocketException: Too many open files > 11/01/06 22:12:48 INFO hdfs.DFSClient: Abandoning block > blk_-2561354015640936272_1741138 > 11/01/06 22:12:54 WARN hdfs.DFSClient: DataStreamer Exception: > java.io.IOException: Too many open files > at sun.nio.ch.EPollArrayWrapper.epollCreate(Native Method) > at sun.nio.ch.EPollArrayWrapper.<init>(EPollArrayWrapper.java:69) > at sun.nio.ch.EPollSelectorImpl.<init>(EPollSelectorImpl.java:52) > at > sun.nio.ch.EPollSelectorProvider.openSelector(EPollSelectorProvider.java:18) > at > org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.get(SocketIOWithTimeout.java:407) > at > org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:322) > at > org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:157) > at > 
org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:146) > at > org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:107) > at > java.io.BufferedOutputStream.write(BufferedOutputStream.java:105) > at java.io.DataOutputStream.write(DataOutputStream.java:90) > at > org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2314) > > 11/01/06 22:12:54 WARN hdfs.DFSClient: Error Recovery for block > blk_2907917521214666486_1741138 bad datanode[0] 172.27.1.34:50010 > 11/01/06 22:12:54 WARN hdfs.DFSClient: Error Recovery for block > blk_2907917521214666486_1741138 in pipeline 172.27.1.34:50010, > 172.27.1.4:50010: bad datanode 172.27.1.34:50010 > Exception in thread "DataStreamer for file YYY block blk_29 > 07917521214666486_1741138" java.lang.NullPointerException > at > org.apache.hadoop.ipc.Client$Connection.handleConnectionFailure(Client.java:351) > at > org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:313) > at > org.apache.hadoop.ipc.Client$Connection.access$1700(Client.java:176) > at org.apache.hadoop.ipc.Client.getConnection(Client.java:860) > at org.apache.hadoop.ipc.Client.call(Client.java:720) > at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220) > at $Proxy9.recoverBlock(Unknown Source) > at > org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2581) > at > org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1600(DFSClient.java:2102) > at > org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2265) > > After this, the HiveServer stops working throwing various exceptions due to > too many open files. > > This is from a trunk checkout from yesterday January 6th. > Seems like we are leaking connections to datanodes on port 50010? > > Regards, > Terje > answered Jan 7 2011 at 02:15 |
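To watch the open-descriptor count climb the way Viral describes, you can count the entries under /proc/<pid>/fd on Linux (roughly what `lsof -p <pid> | wc -l` reports). A small sketch:

```python
import os

def open_fd_count(pid="self"):
    """Count file descriptors currently open in a process via Linux /proc.

    Pass a numeric pid as a string (e.g. "12345") to inspect another
    process you own; "self" inspects the current process.
    """
    return len(os.listdir(f"/proc/{pid}/fd"))
```

Polling this for the HiveServer pid after each LOAD FILE command makes the leak in this thread visible; on non-Linux systems `lsof` or psutil would be the equivalent.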
You mentioned that you got the code from trunk, so it's fair to assume you are not hitting https://issues.apache.org/jira/browse/HIVE-1508 . Worth checking still. Are all the open files Hive history files (they look like hive_job_log*.txt)? Like Viral suggested, you can check that by monitoring open files. -Shrijeet answered Jan 7 2011 at 02:46 |
No, the problem is connections to datanodes on port 50010. Terje answered Jan 7 2011 at 03:14 |
Seems like this works for me too! That probably saved me a bunch of hours tracing this down through Hive and Hadoop. Do you know what the side effect of setting this to false would be? Thanks! Terje answered Jan 7 2011 at 10:42 |
From what I understood, it will then be possible to tell Hive it's loading a CSV while you are in fact loading something else (sequence files, for instance). I don't think that's a big deal. On 7 Jan 2011, at 11:41, "Terje Marthinussen" wrote: > Do you know what the side effect of setting this to false would be? On Fri, Jan 7, 2011 at 4:39 PM, Bennie Schut wrote: > After adding this to the hive-site.xml we never had this problem again: hive.fileformat.check false answered Jan 7 2011 at 10:59 |
In /etc/security/limits.conf on all nodes, I see: * soft nofile 65535 * hard nofile 65535 HBase 0.90 RC3 is deployed on this cluster; in the regionserver log I see: 2011-01-18 07:41:32,887 INFO org.apache.hadoop.hdfs.DFSClient: Could not obtain block blk_628272724324759643_2784573 from any node: java.io.IOException: No live nodes contain current block. Will get new block locations from namenode and retry... 2011-01-18 07:41:32,887 INFO org.apache.hadoop.hdfs.DFSClient: Could not obtain block blk_-266346913956002831_2784643 from any node: java.io.IOException: No live nodes contain current block. Will get new block locations from namenode and retry... ... 2011-01-18 07:41:32,889 INFO org.apache.hadoop.hdfs.DFSClient: Could not obtain block blk_5858710028860745380_2785280 from any node: java.io.IOException: No live nodes contain current block. Will get new block locations from namenode and retry... Are there other parameters I should tune? Thanks answered Jan 18 2011 at 17:48 |
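As an aside, nofile entries in /etc/security/limits.conf can also be pinned to the service user instead of the `*` wildcard; a sketch (the `hbase` user name and the 65535 value are just examples, and the limit only takes effect for sessions started after the change, so the daemon must be restarted from a fresh login):

```ini
# /etc/security/limits.conf
hbase  soft  nofile  65535
hbase  hard  nofile  65535
```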
Ted: The first line in the hbase logs is what hbase sees for ulimit. Check your log. What's it say? There is a bit on ulimit on Ubuntu, if that is what you are running, here in the hbase book: http://people.apache.org/~stack/hbase-0.90.0-candidate-3/docs/notsoquick.html#ulimit St.Ack answered Jan 18 2011 at 18:49 |
Please don't email the 'issues' list. http://wiki.apache.org/hadoop/Hbase/Troubleshooting#A6 2010/6/14 chen peng : > > hi, all: > I ran into a problem after my program had been running for 28+ hours on a three-machine cluster with ulimit set to 32K. > ............ > 2010-06-13 02:06:14,812 INFO org.apache.hadoop.hbase.regionserver.HRegion: Closed nutchtabletest,com.pacsun.shop:http/js_external/sj_flyout.js,1276322851971 > 2010-06-13 02:06:15,373 INFO org.apache.hadoop.hbase.regionserver.HRegion: region nutchtabletest,com.cableorganizer:http/briggs-stratton-generators/storm-ready-kit.htm\x3F=recommended,1276391174177/739848001 available; sequence id is 15639538 > 2010-06-13 02:06:15,373 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Worker: MSG_REGION_OPEN: nutchtabletest,com.cableorganizer:http/fire-protection/composite-sheet-pillows.html,1276391174177 > 2010-06-13 02:06:15,589 INFO org.apache.hadoop.hbase.regionserver.HRegion: region nutchtabletest,com.cableorganizer:http/fire-protection/composite-sheet-pillows.html,1276391174177/1831848882 available; sequence id is 15639539 > 2010-06-13 02:06:15,645 INFO org.apache.hadoop.hbase.regionserver.CompactSplitThread: region split, META updated, and report to master all successful. 
Old region = REGION => {NAME => 'nutchtabletest,com.pacsun.shop:http/js_external/sj_flyout.js,1276322851971', STARTKEY => 'com.pacsun.shop:http/js_external/sj_flyout.js', ENDKEY => 'com.samash.www:http/webapp/wcs/stores/servlet/search_-1_10052_10002_UTF-8___t\x253A3\x252F\x252F\x253Assl\x252F\x252Fsa\x2Bbundle\x2Btaxonomy\x252F\x252F\x253AAccessories\x253ARecording\x2BAccessories\x253AAcoustic\x2BTreatment\x253ABass\x2BTraps__UnitsSold\x252F\x252F1_-1_20__________0_-1__DrillDown___182428_', ENCODED => 908568317, OFFLINE => true, SPLIT => true, TABLE => {{NAME => 'nutchtabletest', FAMILIES => [{NAME => 'bas', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'cnt', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'cnttyp', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'fchi', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'fcht', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'hdrs', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'ilnk', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'modt', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'mtdt', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'olnk', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'prsstt', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'prtstt', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'prvfch', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'prvsig', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'repr', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'rtrs', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'scr', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'sig', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'stt', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'ttl', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'txt', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}}, new regions: nutchtabletest,com.pacsun.shop:http/js_external/sj_flyout.js,1276391174724, nutchtabletest,com.samash.www:http/p/BR15M 15 2 Way Passive Floor Monitor_-49972869,1276391174724. 
Split took 0sec > 2010-06-13 02:06:15,645 INFO org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on region nutchtabletest,com.vitaminshoppe.www:http/search/en/category.jsp\x3Ftype=category\x26catId=cat10134,1276366391680 > 2010-06-13 02:06:15,663 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_OPEN: nutchtabletest,com.pacsun.shop:http/js_external/sj_flyout.js,1276391174724 > 2010-06-13 02:06:15,663 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_OPEN: nutchtabletest,com.samash.www:http/p/BR15M 15 2 Way Passive Floor Monitor_-49972869,1276391174724 > 2010-06-13 02:06:15,664 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Worker: MSG_REGION_OPEN: nutchtabletest,com.pacsun.shop:http/js_external/sj_flyout.js,1276391174724 > 2010-06-13 02:06:16,123 INFO org.apache.hadoop.hdfs.DFSClient: Could not obtain block blk_-5104950836598570436_20226 from any node: java.io.IOException: No live nodes contain current block > 2010-06-13 02:06:19,582 INFO org.apache.hadoop.hdfs.DFSClient: Could not obtain block blk_-5104950836598570436_20226 from any node: java.io.IOException: No live nodes contain current block > 2010-06-13 02:06:22,814 INFO org.apache.hadoop.hdfs.DFSClient: Could not obtain block blk_-6330529819693039456_20275 from any node: java.io.IOException: No live nodes contain current block > 2010-06-13 02:06:23,474 INFO org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on region nutchtabletest,com.vitaminshoppe.www:http/search/en/category.jsp\x3Ftype=category\x26catId=cat10134,1276366391680 in 7sec > 2010-06-13 02:06:23,474 INFO org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on region nutchtabletest,com.cableorganizer:http/briggs-stratton-generators/storm-ready-kit.htm\x3F=recommended,1276391174177 > 2010-06-13 02:06:26,376 INFO org.apache.hadoop.hbase.regionserver.HRegion: region nutchtabletest,com.pacsun.shop:http/js_external/sj_flyout.js,1276391174724/232099566 available; sequence id is 15639825 
> 2010-06-13 02:06:26,376 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Worker: MSG_REGION_OPEN: nutchtabletest,com.samash.www:http/p/BR15M 15 2 Way Passive Floor Monitor_-49972869,1276391174724 > 2010-06-13 02:06:26,598 INFO org.apache.hadoop.hdfs.DFSClient: Could not obtain block blk_-5772421768525630859_20164 from any node: java.io.IOException: No live nodes contain current block > 2010-06-13 02:06:29,612 INFO org.apache.hadoop.hdfs.DFSClient: Could not obtain block blk_-1227684848175029882_20172 from any node: java.io.IOException: No live nodes contain current block > 2010-06-13 02:06:32,618 INFO org.apache.hadoop.hdfs.DFSClient: Could not obtain block blk_-8420981703314551273_20168 from any node: java.io.IOException: No live nodes contain current block > 2010-06-13 02:06:32,672 INFO org.apache.hadoop.hdfs.DFSClient: Could not obtain block blk_-2559191036262569688_20333 from any node: java.io.IOException: No live nodes contain current block > 2010-06-13 02:06:35,619 INFO org.apache.hadoop.hdfs.DFSClient: Could not obtain block blk_-8420981703314551273_20168 from any node: java.io.IOException: No live nodes contain current block > 2010-06-13 02:06:35,674 INFO org.apache.hadoop.hdfs.DFSClient: Could not obtain block blk_-2559191036262569688_20333 from any node: java.io.IOException: No live nodes contain current block > 2010-06-13 02:06:38,637 INFO org.apache.hadoop.hdfs.DFSClient: Could not obtain block blk_-5563912881422417996_20180 from any node: java.io.IOException: No live nodes contain current block > 2010-06-13 02:06:38,675 INFO org.apache.hadoop.hdfs.DFSClient: Could not obtain block blk_-2559191036262569688_20333 from any node: java.io.IOException: No live nodes contain current block > 2010-06-13 02:06:41,650 INFO org.apache.hadoop.hdfs.DFSClient: Could not obtain block blk_2343005765236386064_20192 from any node: java.io.IOException: No live nodes contain current block > 2010-06-13 02:06:41,677 WARN org.apache.hadoop.hdfs.DFSClient: DFS Read: 
java.io.IOException: Could not obtain block: blk_-2559191036262569688_20333 file=/hbase/nutchtabletest/739848001/ilnk/9220624711093099779 > at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:1812) > at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1638) > at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1767) > at java.io.DataInputStream.readFully(DataInputStream.java:178) > at java.io.DataInputStream.readFully(DataInputStream.java:152) > at org.apache.hadoop.hbase.io.hfile.HFile$FixedFileTrailer.deserialize(HFile.java:1368) > at org.apache.hadoop.hbase.io.hfile.HFile$Reader.readTrailer(HFile.java:848) > at org.apache.hadoop.hbase.io.hfile.HFile$Reader.loadFileInfo(HFile.java:793) > at org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:273) > at org.apache.hadoop.hbase.regionserver.StoreFile.<init>(StoreFile.java:129) > at org.apache.hadoop.hbase.regionserver.Store.completeCompaction(Store.java:974) > at org.apache.hadoop.hbase.regionserver.Store.compact(Store.java:766) > at org.apache.hadoop.hbase.regionserver.HRegion.compactStores(HRegion.java:832) > at org.apache.hadoop.hbase.regionserver.HRegion.compactStores(HRegion.java:785) > at org.apache.hadoop.hbase.regionserver.CompactSplitThread.run(CompactSplitThread.java:93) > > > 2010-06-13 02:06:41,677 ERROR org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction/Split failed for region nutchtabletest,com.cableorganizer:http/briggs-stratton-generators/storm-ready-kit.htm\x3F=recommended,1276391174177 > java.io.IOException: Could not obtain block: blk_-2559191036262569688_20333 file=/hbase/nutchtabletest/739848001/ilnk/9220624711093099779 > at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:1812) > at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1638) > at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1767) > at java.io.DataInputStream.readFully(DataInputStream.java:178) > at java.io.DataInputStream.readFully(DataInputStream.java:152) > at org.apache.hadoop.hbase.io.hfile.HFile$FixedFileTrailer.deserialize(HFile.java:1368) > at org.apache.hadoop.hbase.io.hfile.HFile$Reader.readTrailer(HFile.java:848) > at org.apache.hadoop.hbase.io.hfile.HFile$Reader.loadFileInfo(HFile.java:793) > at org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:273) > at org.apache.hadoop.hbase.regionserver.StoreFile.<init>(StoreFile.java:129) > at org.apache.hadoop.hbase.regionserver.Store.completeCompaction(Store.java:974) > at org.apache.hadoop.hbase.regionserver.Store.compact(Store.java:766) > at org.apache.hadoop.hbase.regionserver.HRegion.compactStores(HRegion.java:832) > at org.apache.hadoop.hbase.regionserver.HRegion.compactStores(HRegion.java:785) > at org.apache.hadoop.hbase.regionserver.CompactSplitThread.run(CompactSplitThread.java:93) > 2010-06-13 02:06:41,677 INFO org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on region nutchtabletest,com.cableorganizer:http/fire-protection/composite-sheet-pillows.html,1276391174177 > 2010-06-13 02:06:41,693 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.net.SocketException: Too many open files > 2010-06-13 02:06:41,694 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-4724151989818868275_20334 > 2010-06-13 02:06:44,652 INFO org.apache.hadoop.hdfs.DFSClient: Could not obtain block blk_2343005765236386064_20192 from any node: java.io.IOException: No live nodes contain current block > 2010-06-13 02:06:47,653 INFO org.apache.hadoop.hdfs.DFSClient: Could not obtain block blk_2343005765236386064_20192 from any node: java.io.IOException: No live nodes contain current block > 2010-06-13 02:06:47,695 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.net.SocketException: Too many open files > 2010-06-13 02:06:47,695 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning 
block blk_2197619404089718071_20334 > 2010-06-13 02:06:50,655 WARN org.apache.hadoop.hdfs.DFSClient: DFS Read: java.io.IOException: Could not obtain block: blk_2343005765236386064_20192 file=/hbase/nutchtabletest/686991543/fchi/1061870177496593816.908568317 > at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:1812) > at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1638) > at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1767) > at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1695) > at java.io.DataInputStream.readBoolean(DataInputStream.java:225) > at org.apache.hadoop.hbase.io.Reference.readFields(Reference.java:117) > at org.apache.hadoop.hbase.io.Reference.read(Reference.java:151) > at org.apache.hadoop.hbase.regionserver.StoreFile.<init>(StoreFile.java:126) > at org.apache.hadoop.hbase.regionserver.Store.loadStoreFiles(Store.java:410) > at org.apache.hadoop.hbase.regionserver.Store.<init>(Store.java:221) > at org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:1641) > at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:320) > at org.apache.hadoop.hbase.regionserver.HRegionServer.instantiateRegion(HRegionServer.java:1575) > at org.apache.hadoop.hbase.regionserver.HRegionServer.openRegion(HRegionServer.java:1542) > at org.apache.hadoop.hbase.regionserver.HRegionServer$Worker.run(HRegionServer.java:1462) > at java.lang.Thread.run(Thread.java:619) > > > 2010-06-13 02:06:50,655 WARN org.apache.hadoop.hbase.regionserver.Store: Failed open of hdfs://ubuntu1:9000/hbase/nutchtabletest/686991543/fchi/1061870177496593816.908568317; presumption is that file was corrupted at flush and lost edits picked up by commit log replay. Verify! 
> java.io.IOException: Could not obtain block: blk_2343005765236386064_20192 file=/hbase/nutchtabletest/686991543/fchi/1061870177496593816.908568317 > at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:1812) > at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1638) > at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1767) > at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1695) > at java.io.DataInputStream.readBoolean(DataInputStream.java:225) > at org.apache.hadoop.hbase.io.Reference.readFields(Reference.java:117) > at org.apache.hadoop.hbase.io.Reference.read(Reference.java:151) > at org.apache.hadoop.hbase.regionserver.StoreFile.<init>(StoreFile.java:126) > at org.apache.hadoop.hbase.regionserver.Store.loadStoreFiles(Store.java:410) > at org.apache.hadoop.hbase.regionserver.Store.<init>(Store.java:221) > at org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:1641) > at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:320) > at org.apache.hadoop.hbase.regionserver.HRegionServer.instantiateRegion(HRegionServer.java:1575) > at org.apache.hadoop.hbase.regionserver.HRegionServer.openRegion(HRegionServer.java:1542) > at org.apache.hadoop.hbase.regionserver.HRegionServer$Worker.run(HRegionServer.java:1462) > at java.lang.Thread.run(Thread.java:619) > 2010-06-13 02:06:50,659 INFO org.apache.hadoop.hdfs.DFSClient: Could not obtain block blk_-153353228097894218_20196 from any node: java.io.IOException: No live nodes contain current block > 2010-06-13 02:06:51,804 INFO org.apache.hadoop.hdfs.DFSClient: Could not obtain block blk_-3334740230832671768_20314 from any node: java.io.IOException: No live nodes contain current block > 2010-06-13 02:06:53,668 INFO org.apache.hadoop.hdfs.DFSClient: Could not obtain block blk_4832263854844000864_20200 from any node: java.io.IOException: No live nodes contain current block > 2010-06-13 02:06:53,697 INFO 
org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.net.SocketException: Too many open files > 2010-06-13 02:06:53,697 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-8490642742553142526_20334 > 2010-06-13 02:06:54,806 INFO org.apache.hadoop.hdfs.DFSClient: Could not obtain block blk_-3334740230832671768_20314 from any node: java.io.IOException: No live nodes contain current block > 2010-06-13 02:06:56,669 INFO org.apache.hadoop.hdfs.DFSClient: Could not obtain block blk_4832263854844000864_20200 from any node: java.io.IOException: No live nodes contain current block > 2010-06-13 02:06:57,808 INFO org.apache.hadoop.hdfs.DFSClient: Could not obtain block blk_-3334740230832671768_20314 from any node: java.io.IOException: No live nodes contain current block > 2010-06-13 02:06:59,670 INFO org.apache.hadoop.hdfs.DFSClient: Could not obtain block blk_4832263854844000864_20200 from any node: java.io.IOException: No live nodes contain current block > 2010-06-13 02:06:59,698 INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.net.SocketException: Too many open files > 2010-06-13 02:06:59,698 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_8167205924627743813_20334 > 2010-06-13 02:07:00,809 WARN org.apache.hadoop.hdfs.DFSClient: DFS Read: java.io.IOException: Could not obtain block: blk_-3334740230832671768_20314 file=/hbase/.META./1028785192/info/515957856915851220 > at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:1812) > at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1638) > at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1767) > at java.io.DataInputStream.read(DataInputStream.java:132) > at org.apache.hadoop.hbase.io.hfile.BoundedRangeFileInputStream.read(BoundedRangeFileInputStream.java:105) > at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:100) > at org.apache.hadoop.hbase.io.hfile.HFile$Reader.decompress(HFile.java:1018) > at org.apache.hadoop.hbase.io.hfile.HFile$Reader.readBlock(HFile.java:966) > at org.apache.hadoop.hbase.io.hfile.HFile$Reader$Scanner.seekTo(HFile.java:1291) > at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seekAtOrAfter(StoreFileScanner.java:98) > at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:68) > at org.apache.hadoop.hbase.regionserver.StoreScanner.<init>(StoreScanner.java:72) > at org.apache.hadoop.hbase.regionserver.Store.getScanner(Store.java:1304) > at org.apache.hadoop.hbase.regionserver.HRegion$RegionScanner.initHeap(HRegion.java:1850) > at org.apache.hadoop.hbase.regionserver.HRegion$RegionScanner.next(HRegion.java:1883) > at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1906) > at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1877) > at sun.reflect.GeneratedMethodAccessor49.invoke(Unknown Source) > at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) > at java.lang.reflect.Method.invoke(Method.java:597) > at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:657) > at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915) > > > 2010-06-13 02:07:00,809 INFO org.apache.hadoop.hdfs.DFSClient: Could not obtain block blk_-3334740230832671768_20314 from any node: java.io.IOException: No live nodes contain current block > 2010-06-13 02:07:02,671 WARN org.apache.hadoop.hdfs.DFSClient: DFS Read: java.io.IOException: Could not obtain block: blk_4832263854844000864_20200 file=/hbase/nutchtabletest/686991543/fcht/3329079451353795349.908568317 > at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:1812) > at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1638) > at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1767) > at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1695) > at java.io.DataInputStream.readBoolean(DataInputStream.java:225) > at org.apache.hadoop.hbase.io.Reference.readFields(Reference.java:117) > at org.apache.hadoop.hbase.io.Reference.read(Reference.java:151) > at org.apache.hadoop.hbase.regionserver.StoreFile.<init>(StoreFile.java:126) > at org.apache.hadoop.hbase.regionserver.Store.loadStoreFiles(Store.java:410) > at org.apache.hadoop.hbase.regionserver.Store.<init>(Store.java:221) > at org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:1641) > at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:320) > at org.apache.hadoop.hbase.regionserver.HRegionServer.instantiateRegion(HRegionServer.java:1575) > at org.apache.hadoop.hbase.regionserver.HRegionServer.openRegion(HRegionServer.java:1542) > at org.apache.hadoop.hbase.regionserver.HRegionServer$Worker.run(HRegionServer.java:1462) > at java.lang.Thread.run(Thread.java:619) > > > 2010-06-13 02:07:02,674 WARN org.apache.hadoop.hbase.regionserver.Store: Failed open of hdfs://ubuntu1:9000/hbase/nutchtabletest/686991543/fcht/3329079451353795349.908568317; presumption is that file was corrupted at flush and lost edits picked up by commit log replay. Verify! 
> java.io.IOException: Could not obtain block:blk_4832263854844000864_20200file=/hbase/nutchtabletest/686991543/fcht/3329079451353795349.908568317 > at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:1812) > at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1638) > at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1767) > at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1695) > at java.io.DataInputStream.readBoolean(DataInputStream.java:225) > at org.apache.hadoop.hbase.io.Reference.readFields(Reference.java:117) > at org.apache.hadoop.hbase.io.Reference.read(Reference.java:151) > at org.apache.hadoop.hbase.regionserver.StoreFile.(StoreFile.java:126) > at org.apache.hadoop.hbase.regionserver.Store.loadStoreFiles(Store.java:410) > at org.apache.hadoop.hbase.regionserver.Store.(Store.java:221) > at org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:1641) > at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:320) > at org.apache.hadoop.hbase.regionserver.HRegionServer.instantiateRegion(HRegionServer.java:1575) > at org.apache.hadoop.hbase.regionserver.HRegionServer.openRegion(HRegionServer.java:1542) > at org.apache.hadoop.hbase.regionserver.HRegionServer$Worker.run(HRegionServer.java:1462) > at java.lang.Thread.run(Thread.java:619) > 2010-06-13 02:07:02,676 INFO org.apache.hadoop.hdfs.DFSClient: Couldnot obtain block blk_8179737564656994784_20204 from any node: java.io.IOException: No live nodes contain current block > 2010-06-13 02:07:03,817 WARN org.apache.hadoop.hdfs.DFSClient: Failedto connect to /172.0.8.251:50010 for file/hbase/.META./1028785192/info/515957856915851220 for block-3334740230832671768:java.net.SocketException: Too many open files > at sun.nio.ch.Net.socket0(Native Method) > at sun.nio.ch.Net.socket(Net.java:94) > at sun.nio.ch.SocketChannelImpl.(SocketChannelImpl.java:84) > at 
sun.nio.ch.SelectorProviderImpl.openSocketChannel(SelectorProviderImpl.java:37) > at java.nio.channels.SocketChannel.open(SocketChannel.java:105) > at org.apache.hadoop.net.StandardSocketFactory.createSocket(StandardSocketFactory.java:58) > at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.fetchBlockByteRange(DFSClient.java:1847) > at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1922) > at org.apache.hadoop.fs.FSDataInputStream.read(FSDataInputStream.java:46) > at org.apache.hadoop.hbase.io.hfile.BoundedRangeFileInputStream.read(BoundedRangeFileInputStream.java:101) > at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:100) > at org.apache.hadoop.hbase.io.hfile.HFile$Reader.decompress(HFile.java:1018) > at org.apache.hadoop.hbase.io.hfile.HFile$Reader.readBlock(HFile.java:966) > at org.apache.hadoop.hbase.io.hfile.HFile$Reader$Scanner.loadBlock(HFile.java:1300) > at org.apache.hadoop.hbase.io.hfile.HFile$Reader$Scanner.seekTo(HFile.java:1182) > at org.apache.hadoop.hbase.regionserver.Store.seekToScanner(Store.java:1164) > at org.apache.hadoop.hbase.regionserver.Store.rowAtOrBeforeFromStoreFile(Store.java:1131) > at org.apache.hadoop.hbase.regionserver.Store.getRowKeyAtOrBefore(Store.java:1092) > at org.apache.hadoop.hbase.regionserver.HRegion.getClosestRowBefore(HRegion.java:1147) > at org.apache.hadoop.hbase.regionserver.HRegionServer.getClosestRowBefore(HRegionServer.java:1729) > at sun.reflect.GeneratedMethodAccessor56.invoke(Unknown Source) > at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) > at java.lang.reflect.Method.invoke(Method.java:597) > at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:657) > at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915) > > > 2010-06-13 02:07:03,820 WARN org.apache.hadoop.hdfs.DFSClient: Failedto connect to /172.0.8.248:50010 for file/hbase/.META./1028785192/info/515957856915851220 for 
block-3334740230832671768:java.net.SocketException: Too many open files > at sun.nio.ch.Net.socket0(Native Method) > at sun.nio.ch.Net.socket(Net.java:94) > at sun.nio.ch.SocketChannelImpl.(SocketChannelImpl.java:84) > at sun.nio.ch.SelectorProviderImpl.openSocketChannel(SelectorProviderImpl.java:37) > at java.nio.channels.SocketChannel.open(SocketChannel.java:105) > at org.apache.hadoop.net.StandardSocketFactory.createSocket(StandardSocketFactory.java:58) > at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.fetchBlockByteRange(DFSClient.java:1847) > at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1922) > at org.apache.hadoop.fs.FSDataInputStream.read(FSDataInputStream.java:46) > at org.apache.hadoop.hbase.io.hfile.BoundedRangeFileInputStream.read(BoundedRangeFileInputStream.java:101) > at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:100) > at org.apache.hadoop.hbase.io.hfile.HFile$Reader.decompress(HFile.java:1018) > at org.apache.hadoop.hbase.io.hfile.HFile$Reader.readBlock(HFile.java:966) > at org.apache.hadoop.hbase.io.hfile.HFile$Reader$Scanner.loadBlock(HFile.java:1300) > at org.apache.hadoop.hbase.io.hfile.HFile$Reader$Scanner.seekTo(HFile.java:1182) > at org.apache.hadoop.hbase.regionserver.Store.seekToScanner(Store.java:1164) > at org.apache.hadoop.hbase.regionserver.Store.rowAtOrBeforeFromStoreFile(Store.java:1131) > at org.apache.hadoop.hbase.regionserver.Store.getRowKeyAtOrBefore(Store.java:1092) > at org.apache.hadoop.hbase.regionserver.HRegion.getClosestRowBefore(HRegion.java:1147) > at org.apache.hadoop.hbase.regionserver.HRegionServer.getClosestRowBefore(HRegionServer.java:1729) > at sun.reflect.GeneratedMethodAccessor56.invoke(Unknown Source) > at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) > at java.lang.reflect.Method.invoke(Method.java:597) > at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:657) > at 
org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915) > > > 2010-06-13 02:07:03,820 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: > java.net.SocketException: Too many open files > at sun.nio.ch.Net.socket0(Native Method) > at sun.nio.ch.Net.socket(Net.java:94) > at sun.nio.ch.SocketChannelImpl.(SocketChannelImpl.java:84) > at sun.nio.ch.SelectorProviderImpl.openSocketChannel(SelectorProviderImpl.java:37) > at java.nio.channels.SocketChannel.open(SocketChannel.java:105) > at org.apache.hadoop.net.StandardSocketFactory.createSocket(StandardSocketFactory.java:58) > at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.fetchBlockByteRange(DFSClient.java:1847) > at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1922) > at org.apache.hadoop.fs.FSDataInputStream.read(FSDataInputStream.java:46) > at org.apache.hadoop.hbase.io.hfile.BoundedRangeFileInputStream.read(BoundedRangeFileInputStream.java:101) > at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:100) > at org.apache.hadoop.hbase.io.hfile.HFile$Reader.decompress(HFile.java:1018) > at org.apache.hadoop.hbase.io.hfile.HFile$Reader.readBlock(HFile.java:966) > at org.apache.hadoop.hbase.io.hfile.HFile$Reader$Scanner.loadBlock(HFile.java:1300) > at org.apache.hadoop.hbase.io.hfile.HFile$Reader$Scanner.seekTo(HFile.java:1182) > at org.apache.hadoop.hbase.regionserver.Store.seekToScanner(Store.java:1164) > at org.apache.hadoop.hbase.regionserver.Store.rowAtOrBeforeFromStoreFile(Store.java:1131) > at org.apache.hadoop.hbase.regionserver.Store.getRowKeyAtOrBefore(Store.java:1092) > at org.apache.hadoop.hbase.regionserver.HRegion.getClosestRowBefore(HRegion.java:1147) > at org.apache.hadoop.hbase.regionserver.HRegionServer.getClosestRowBefore(HRegionServer.java:1729) > at sun.reflect.GeneratedMethodAccessor56.invoke(Unknown Source) > at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) > at 
java.lang.reflect.Method.invoke(Method.java:597) > at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:657) > at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915) > 2010-06-13 02:07:03,820 INFO org.apache.hadoop.hdfs.DFSClient: Couldnot obtain block blk_-3334740230832671768_20314 from any node: java.io.IOException: No live nodes contain current block > 2010-06-13 02:07:03,827 INFO org.apache.hadoop.ipc.HBaseServer: IPCServer handler 6 on 60020, call getClosestRowBefore([B@d0a973,[B@124c6ab, [B@16f25f6) from 172.0.8.251:36613: error:java.net.SocketException: Too many open files > java.net.SocketException: Too many open files > at sun.nio.ch.Net.socket0(Native Method) > at sun.nio.ch.Net.socket(Net.java:94) > at sun.nio.ch.SocketChannelImpl.(SocketChannelImpl.java:84) > at sun.nio.ch.SelectorProviderImpl.openSocketChannel(SelectorProviderImpl.java:37) > at java.nio.channels.SocketChannel.open(SocketChannel.java:105) > at org.apache.hadoop.net.StandardSocketFactory.createSocket(StandardSocketFactory.java:58) > at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.fetchBlockByteRange(DFSClient.java:1847) > at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1922) > at org.apache.hadoop.fs.FSDataInputStream.read(FSDataInputStream.java:46) > at org.apache.hadoop.hbase.io.hfile.BoundedRangeFileInputStream.read(BoundedRangeFileInputStream.java:101) > at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:100) > at org.apache.hadoop.hbase.io.hfile.HFile$Reader.decompress(HFile.java:1018) > at org.apache.hadoop.hbase.io.hfile.HFile$Reader.readBlock(HFile.java:966) > at org.apache.hadoop.hbase.io.hfile.HFile$Reader$Scanner.loadBlock(HFile.java:1300) > at org.apache.hadoop.hbase.io.hfile.HFile$Reader$Scanner.seekTo(HFile.java:1182) > at org.apache.hadoop.hbase.regionserver.Store.seekToScanner(Store.java:1164) > at org.apache.hadoop.hbase.regionserver.Store.rowAtOrBeforeFromStoreFile(Store.java:1131) > at 
org.apache.hadoop.hbase.regionserver.Store.getRowKeyAtOrBefore(Store.java:1092) > at org.apache.hadoop.hbase.regionserver.HRegion.getClosestRowBefore(HRegion.java:1147) > at org.apache.hadoop.hbase.regionserver.HRegionServer.getClosestRowBefore(HRegionServer.java:1729) > at sun.reflect.GeneratedMethodAccessor56.invoke(Unknown Source) > at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) > at java.lang.reflect.Method.invoke(Method.java:597) > at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:657) > at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915) > 2010-06-13 02:07:05,677 INFO org.apache.hadoop.hdfs.DFSClient: Couldnot obtain block blk_8179737564656994784_20204 from any node: java.io.IOException: No live nodes contain current block > 2010-06-13 02:07:05,699 WARN org.apache.hadoop.hdfs.DFSClient:DataStreamer Exception: java.io.IOException: Unable to create new block. > at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2845) > at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2102) > at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2288) > _________________________________________________________________ > Hotmail: Trusted email with Microsoft’s powerful SPAM protection. > https://signup.live.com/signup.aspx?id=60969 answered Jun 14 2010 at 19:02 |
Michael, Not sure if you still have that problem, but I got it too, and here is how to fix it: http://pero.blogs.aprilmayjune.org/2009/01/22/hadoop-and-linux-kernel-2627-epoll-limits/ J-D answered Mar 13 2009 at 14:19 |
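For context on J-D's link: Linux 2.6.27 introduced per-user epoll limits that Hadoop's NIO selectors could exhaust well before the ordinary file-descriptor limit. A sketch of the sysctl workaround described in posts from that era; the knob name and value here are assumptions to verify against your kernel (later kernels changed or removed some of these limits):

```
# /etc/sysctl.conf -- assumed knob and value; kernel 2.6.27 shipped a low default (128).
# Apply with `sysctl -p` after editing.
fs.epoll.max_user_instances = 4096
```
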
Are you using the java producer client? Thanks, Jun answered Sep 24 2013 at 21:01 |
No. We are using the kafka-rb ruby gem producer. https://github.com/acrosa/kafka-rb Now that you've asked that question, I have to ask: is there a problem with the Java producer? Sent from my iPhone answered Sep 25 2013 at 06:08 |
We haven't seen any socket leaks with the java producer. If you have lots of unexplained socket connections in established mode, one possible cause is that the client created new producer instances but didn't close the old ones. Thanks, Jun answered Sep 25 2013 at 09:06 |
Any other ideas? answered Sep 25 2013 at 16:30 |
FYI if I kill all producers I don't see the number of open files drop. I still see all the ESTABLISHED connections. Is there a broker setting to automatically kill any inactive TCP connections? answered Sep 25 2013 at 16:48 |
If a client is gone, the broker should automatically close those broken sockets. Are you using a hardware load balancer? Thanks, Jun answered Sep 25 2013 at 21:38 |
Are you using the java or non-java producer? Are you using ZK based, broker-list based, or VIP based producer? Thanks, Jun answered Sep 26 2013 at 07:37 |
We are using a hardware load balancer with a VIP-based ruby producer. answered Sep 26 2013 at 15:07 |
What OS settings did you change? And how high is that huge file limit? answered Sep 26 2013 at 15:07 |
No, this is all within the same DC. I think the problem has to do with the LB. We've upgraded our producers to point directly to a node for testing, and after running it all night I don't see any more connections than there are supposed to be. Can I ask which LB you are using? We are using A10's. answered Sep 27 2013 at 09:35 |
Hi Nicolas, we did run into a similar issue here (lots of ESTABLISHED connections on the brokers, but none on the consumers/producers). In our case, it was a firewall issue where connections that were inactive for more than a certain time were silently dropped by the firewall (but no TCP RST was sent) and only one side of the connection noticed the drop. Maybe that helps. Flo answered Oct 4 2013 at 05:14 |
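The keepalive settings mentioned earlier in the thread are the usual countermeasure for this kind of silent middlebox drop: have the kernel probe idle connections well before the firewall's idle timeout, so the side holding a dead socket finds out. A hedged sketch; the values are illustrative, not recommendations, and must sit below your firewall/LB timeout:

```
# /etc/sysctl.conf -- illustrative values; apply with `sysctl -p`.
net.ipv4.tcp_keepalive_time = 600    # seconds of idleness before the first probe
net.ipv4.tcp_keepalive_intvl = 60    # seconds between unanswered probes
net.ipv4.tcp_keepalive_probes = 5    # probes before the kernel drops the connection
```

Note that these only take effect on sockets that enable SO_KEEPALIVE; whether the broker does so depends on its socket configuration.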
Hi, You can use JMX tools like JConsole to connect to the broker and check the connections and their IP addresses. I suggest you first find out which client process these many connections belong to, and then look at the client code/configuration to find out why connections are not closed. I am not aware of any bugs in this area, so it's likely a user/configuration problem. Regards, Torsten Mielke tmielke.blogspot.com answered Jan 13 2014 at 01:51 |
By default, ActiveMQ closes connections asynchronously. That is normally not a problem, unless you open and close many connections (in our case, hundreds per minute). In that case the asynchronous closing may not be able to keep up with your open/close rate and you start accumulating file descriptors. There is a transport-level configuration switch you can set in your activemq.xml to make closing synchronous; we have it set for our STOMP transport. But as said, this should not happen when you have just a low open/close rate. By the way, if you allow 1000 connections at maximum and expect to get close to that number, you'll need to raise the limit on most Linux distros. The default tends to be 1024 (and that's not enough for 1000 connections, since you also need file descriptors for libraries, storage files, etc.). We have added this to our /etc/init.d/activemq script: ulimit -n 102400 Best regards, Arjen answered Jan 13 2014 at 04:07 |
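A `ulimit -n` call in an init script only affects the shell (and children) it runs in. A common persistent alternative on Linux is a PAM limits entry; the user name here is hypothetical, assuming the broker runs as a dedicated `activemq` account:

```
# /etc/security/limits.conf -- hypothetical user name; values mirror the init-script ulimit.
activemq  soft  nofile  102400
activemq  hard  nofile  102400
```

This only applies to sessions opened through PAM, so services started by other init systems may need their own mechanism (e.g. a limit directive in the service definition).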
The 1000 maximum connections is an insane number for a single connection pool. I'm hard-pressed to imagine a scenario in which such a large number of connections can improve performance. Why the large number? answered Jan 13 2014 at 15:32 |
2014/1/13 Michael Priess: Why don't you use PooledConnectionFactory? Regards answered Jan 15 2014 at 01:16 |
Hi, correct me if I'm wrong, but if I listen to a topic, doesn't the connection have to stay open the whole time? To my thinking, a PooledConnectionFactory only makes sense if I'm sending a lot of messages to the broker. BTW, does anyone know the exact number of file handles that are open per connection on the server? 2014/1/15 José María Zaragoza answered Jan 16 2014 at 00:54 |
I think he was referring to your clients, which could be adjusted to pool a small number of connections rather than opening many. You obviously can't control this from the server side. Still, it's very uncommon to really need 1000 connections at the same time; it generally suggests your clients open way too many connections, or open and close them so fast that the server can't keep up in its default configuration and accumulates many "waiting to be closed" connections. For each connection to your server there is a single file descriptor. You can, among other tools, use 'lsof' to see how many your java process holds (see man lsof for usage examples). You can also see connections in CLOSE_WAIT using lsof. If you really have that many clients, there isn't much you can do. You're limited to adjusting the server so it can hold more file descriptors (i.e. use ulimit) or changing your server architecture to have more servers (with that many clients, some redundancy is a good thing anyway). Best regards, Arjen answered Jan 16 2014 at 01:19 |
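As a minimal sketch of the counting Arjen describes, `/proc/<pid>/fd` can stand in for `lsof` on Linux; here the current shell's own PID is used purely for illustration, so substitute the broker's PID in practice:

```shell
#!/bin/sh
# Count open file descriptors for a process by listing /proc/<pid>/fd.
# $$ (this shell) is a stand-in for the broker's PID.
pid=$$
count=$(ls "/proc/$pid/fd" | wc -l)
echo "process $pid holds $count open file descriptors"
```

The same information is what `lsof -p <pid> | wc -l` approximates, and `lsof -p <pid> | grep CLOSE_WAIT` narrows it to half-closed sockets.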
2014/1/16 Arjen van der Meijden: Well, I don't know much about Apache Camel, but per http://camel.apache.org/activemq.html you can use PooledConnectionFactory. Regards answered Jan 16 2014 at 01:54 |
In general, the use of a pooling-aware JMS ConnectionFactory such as ActiveMQ's PooledConnectionFactory is *always* recommended, particularly in Camel, which uses Spring JMS underneath. Regards, Torsten Mielke tmielke.blogspot.com answered Jan 16 2014 at 02:14 |
I think this has been fixed in master. Can you try building the new jar and trying that? 2.4 will be cut very soon. > -- > You received this message because you are subscribed to the Google Groups "mongodb-user" group. > To post to this group, send email to [email protected]. > To unsubscribe from this group, send email to [email protected]. > For more options, visit this group at http://groups.google.com/group/mongodb-user?hl=en. answered Dec 11 2010 at 21:14 |
When I encountered this problem, I checked ulimit -n and /proc/sys/net/ipv4/tcp_fin_timeout. answered Dec 11 2010 at 21:16 |
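The two values the poster checked can be read directly; `ulimit -n` is the per-process open-file limit, and the proc file (Linux-specific) is how long sockets linger in FIN-WAIT-2 before the kernel reclaims them:

```shell
#!/bin/sh
# Print the per-process open-file limit and the FIN-WAIT-2 timeout.
ulimit -n
cat /proc/sys/net/ipv4/tcp_fin_timeout 2>/dev/null || echo "tcp_fin_timeout: not available (non-Linux?)"
```
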
Thanks Eliot. I built a 2.4-SNAPSHOT jar from master and deployed it onto my (not yet live) production servers, and it appears to have completely fixed the problem. answered Dec 12 2010 at 14:04 |
You can use the --maxConns parameter to limit the number of connections. But it seems like your client app isn't closing connections correctly. answered Apr 6 2010 at 12:39 |
It may be possible to set these globally in /etc/system, e.g. (untried!): set rlim_fd_max=65536 set rlim_fd_cur=256 -- Richard answered Apr 6 2010 at 12:42 |
Mathias, I think I agree. But I thought if my connection object falls out of scope, the connection would be implicitly closed. What's the best way to make sure that happens? Is there an implicit disconnect? I'm using the Java API. Cheers answered Apr 6 2010 at 16:25 |
The Mongo object handles connection pooling for you. Have you read http://www.mongodb.org/display/DOCS/Java+Driver+Concurrency ? Are you using a single Mongo object, or one per request? answered Apr 6 2010 at 16:33 |
If a Mongo goes out of scope, the connection won't be closed until GC picks it up, which could be days. You should use one Mongo instance per JVM most of the time. Also, are you sure it's sockets? Can you do "netstat -an | grep 27017 | wc -l" on the server? answered Apr 6 2010 at 19:11 |
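A variant of Eliot's check that needs no netstat binary: on Linux, `/proc/net/tcp` lists every IPv4 TCP socket, with ports in hexadecimal, so MongoDB's default port 27017 shows up as 6989:

```shell
#!/bin/sh
# Count TCP sockets (any state) whose local or remote port is 27017.
# /proc/net/tcp is Linux-specific and IPv4-only; ports are hex (27017 = 0x6989).
port_hex=$(printf '%04X' 27017)
count=$(grep -c ":$port_hex" /proc/net/tcp)
echo "sockets touching port 27017: $count"
```

For IPv6 connections, repeat the grep against `/proc/net/tcp6`.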
Hi Rohit, you mention a “potential leak” and “getting similar stack traces”. How and why exactly are these stack traces emitted? Due to a crash? One thing that comes to mind is thread pool sizes: if you have really big pools configured, then you will see lots of file descriptors being used (two per thread on the JVMs I looked at). Another thing: have you made certain that you don’t create ActorSystems other than during application start-up? (We had reports of this kind where someone had used `def` where `val` was meant; it happens to the best!) In any case, we would need to see the thread dump and the akka config from a failing system in order to be certain. Regards, Roland 13 mar 2014 kl. 01:43 skrev Rohit Gupta : -- >>>>>>>>>> Read the docs: http://akka.io/docs/ >>>>>>>>>> Check the FAQ: http://doc.akka.io/docs/akka/current/additional/faq.html >>>>>>>>>> Search the archives: https://groups.google.com/group/akka-user --- You received this message because you are subscribed to the Google Groups "Akka User List" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To post to this group, send email to [email protected]. Visit this group at http://groups.google.com/group/akka-user. For more options, visit https://groups.google.com/d/optout. Dr. Roland Kuhn Akka Tech Lead Typesafe – Reactive apps on the JVM. twitter: @rolandkuhn answered Mar 13 2014 at 15:31 |
Hi Roland, you mention a “potential leak” and “getting similar stack traces”. How and why exactly are these stack traces emitted? Due to a crash? These stack traces were seen in thread dumps while the application was running. We suspect the number of threads with this trace accumulates over time/load. The thing we are worried about is the threads for Netty connections; there are 456 similar threads. [Attached file] Another thing: have you made certain that you don’t create ActorSystems other than during application start-up? We create the ActorSystem only during start-up, and use 'def'. In any case, we would need to see the thread dump and the akka config from a failing system in order to be certain. We are using the akka default configuration for our application, just changing the host/port and logging properties. I am attaching the thread dumps. Thanks, -Rohit answered Mar 18 2014 at 17:43 |
Hi Rohit, 19 mar 2014 kl. 01:36 skrev Rohit Gupta: "We create the ActorSystem only during start-up, and use 'def'." From the log it is clear that you currently have three instances of the same actor system name running (and possibly had more before): make sure that you really only create the ActorSystem once, e.g. by making it a `val`. On the thread dumps: I gave them a quick glance and what I see does not point towards a defect in Akka (you do seem to have Netty instances running on that server without properly capping their worker pools; I would guess the machine has a lot of cores, right?). We cannot debug production issues on this mailing list for free, we simply don’t have the bandwidth, but we can certainly help you on commercial terms; in that case just ping me off-list. Regards, Roland answered Mar 19 2014 at 02:10 |
I followed the instructions here to resolve this: http://posidev.com/blog/2009/06/04/set-ulimit-parameters-on-ubuntu/ answered Sep 30 2011 at 14:26 |
Group Incubator-kafka-users
asked Sep 24 2013 at 17:33
active Sep 30 2011 at 14:26
posts:46
users:27