QnaList > Groups > Spark-User > Jul 2014
faq

Run Spark Unit Test On Windows 7

Hi all,
I'm trying to run some transformation on *Spark*, it works fine on cluster
(YARN, linux machines). However, when I'm trying to run it on local machine
(*Windows 7*) under unit test, I got errors:
java.io.IOException: Could not locate executable null\bin\winutils.exe
in the Hadoop binaries.
at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:318)
at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:333)
at org.apache.hadoop.util.Shell.(Shell.java:326)
at org.apache.hadoop.util.StringUtils.(StringUtils.java:76)
at org.apache.hadoop.security.Groups.parseStaticMapping(Groups.java:93)
My code is following:
@Test
def testETL() = {
    val conf = new SparkConf()
    val sc = new SparkContext("local", "test", conf)
    try {
        val etl = new IxtoolsDailyAgg() // empty constructor
        val data = sc.parallelize(List("in1", "in2", "in3"))
        etl.etl(data) // rdd transformation, no access to SparkContext or Hadoop
        Assert.assertTrue(true)
    } finally {
        if(sc != null)
            sc.stop()
    }
}
Why is it trying to access hadoop at all? and how can I fix it? Thank you
in advance
Thank you,
Konstantin Kudryavtsev

asked Jul 2 2014 at 09:38

Konstantin Kudryavtsev 's gravatar image



7 Replies for : Run Spark Unit Test On Windows 7
Hi Konstatin,
We use hadoop as a library in a few places in Spark. I wonder why the path
includes "null" though.
Could you provide the full stack trace?
Andrew
2014-07-02 9:38 GMT-07:00 Konstantin Kudryavtsev <
[email protected]>:

answered Jul 2 2014 at 10:15

Andrew Or 's gravatar image


Hi Andrew,
it's windows 7 and I doesn't set up any env variables here
The full stack trace:
14/07/02 19:59:31 WARN NativeCodeLoader: Unable to load native-hadoop
library for your platform... using builtin-java classes where applicable
14/07/02 19:59:31 ERROR Shell: Failed to locate the winutils binary in the
hadoop binary path
java.io.IOException: Could not locate executable null\bin\winutils.exe in
the Hadoop binaries.
at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:318)
 at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:333)
at org.apache.hadoop.util.Shell.(Shell.java:326)
 at org.apache.hadoop.util.StringUtils.(StringUtils.java:76)
at org.apache.hadoop.security.Groups.parseStaticMapping(Groups.java:93)
 at org.apache.hadoop.security.Groups.(Groups.java:77)
at
org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:240)
 at
org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:255)
at
org.apache.hadoop.security.UserGroupInformation.setConfiguration(UserGroupInformation.java:283)
 at org.apache.spark.deploy.SparkHadoopUtil.(SparkHadoopUtil.scala:36)
at
org.apache.spark.deploy.SparkHadoopUtil$.(SparkHadoopUtil.scala:109)
 at org.apache.spark.deploy.SparkHadoopUtil$.(SparkHadoopUtil.scala)
at org.apache.spark.SparkContext.(SparkContext.scala:228)
 at org.apache.spark.SparkContext.(SparkContext.scala:97)
at my.example.EtlTest.testETL(IxtoolsDailyAggTest.scala:13)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
 at junit.framework.TestCase.runTest(TestCase.java:168)
at junit.framework.TestCase.runBare(TestCase.java:134)
 at junit.framework.TestResult$1.protect(TestResult.java:110)
at junit.framework.TestResult.runProtected(TestResult.java:128)
 at junit.framework.TestResult.run(TestResult.java:113)
at junit.framework.TestCase.run(TestCase.java:124)
 at junit.framework.TestSuite.runTest(TestSuite.java:232)
at junit.framework.TestSuite.run(TestSuite.java:227)
 at
org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:81)
at org.junit.runner.JUnitCore.run(JUnitCore.java:130)
 at
com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:74)
at
com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:211)
 at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:67)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:120)
Thank you,
Konstantin Kudryavtsev

answered Jul 2 2014 at 10:20

Konstantin Kudryavtsev 's gravatar image


By any chance do you have HDP 2.1 installed? you may need to install the utils and update the env variables per http://stackoverflow.com/questions/18630019/running-apache-hadoop-2-1-0-on-windows

answered Jul 2 2014 at 11:10

Denny Lee 's gravatar image


No, I don�t
why do I need to have HDP installed? I don�t use Hadoop at all and I�d like to read data from local filesystem

answered Jul 2 2014 at 12:04

Kostiantyn Kudriavtsev 's gravatar image


You don't actually need it per se - its just that some of the Spark
libraries are referencing Hadoop libraries even if they ultimately do not
call them. When I was doing some early builds of Spark on Windows, I
admittedly had Hadoop on Windows running as well and had not run into this
particular issue.

answered Jul 2 2014 at 12:24

Denny Lee 's gravatar image


It sounds really strange...
I guess it is a bug, critical bug and must be fixed... at least some flag
must be add (unable.hadoop)
I found the next workaround :
1) download compiled winutils.exe from
http://social.msdn.microsoft.com/Forums/windowsazure/en-US/28a57efb-082b-424b-8d9e-731b1fe135de/please-read-if-experiencing-job-failures?forum=hdinsight
2) put this file into d:\winutil\bin
3) add in my test: System.setProperty("hadoop.home.dir", "d:\\winutil\\")
after that test runs
Thank you,
Konstantin Kudryavtsev

answered Jul 2 2014 at 23:44

Konstantin Kudryavtsev 's gravatar image


Hi Konstantin,
Could you please create a jira item at: https://issues.apache.org/jira/browse/SPARK/ so this issue can be tracked?
Thanks,
Denny
It sounds really strange...
I guess it is a bug, critical bug and must be fixed... at least some flag must be add (unable.hadoop)
I found the next workaround :
1) download compiled winutils.exe from http://social.msdn.microsoft.com/Forums/windowsazure/en-US/28a57efb-082b-424b-8d9e-731b1fe135de/please-read-if-experiencing-job-failures?forum=hdinsight
2) put this file into d:\winutil\bin
3) add in my test: System.setProperty("hadoop.home.dir", "d:\\winutil\\")
after that test runs
Thank you,
Konstantin Kudryavtsev
You don't actually need it per se - its just that some of the Spark libraries are referencing Hadoop libraries even if they ultimately do not call them. When I was doing some early builds of Spark on Windows, I admittedly had Hadoop on Windows running as well and had not run into this particular issue.
No, I don’t
why do I need to have HDP installed? I don’t use Hadoop at all and I’d like to read data from local filesystem
By any chance do you have HDP 2.1 installed? you may need to install the utils and update the env variables per http://stackoverflow.com/questions/18630019/running-apache-hadoop-2-1-0-on-windows
Hi Andrew,
it's windows 7 and I doesn't set up any env variables here 
The full stack trace:
14/07/02 19:59:31 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/07/02 19:59:31 ERROR Shell: Failed to locate the winutils binary in the hadoop binary path
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:318)
at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:333)
at org.apache.hadoop.util.Shell.(Shell.java:326)
at org.apache.hadoop.util.StringUtils.(StringUtils.java:76)
at org.apache.hadoop.security.Groups.parseStaticMapping(Groups.java:93)
at org.apache.hadoop.security.Groups.(Groups.java:77)
at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:240)
at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:255)
at org.apache.hadoop.security.UserGroupInformation.setConfiguration(UserGroupInformation.java:283)
at org.apache.spark.deploy.SparkHadoopUtil.(SparkHadoopUtil.scala:36)
at org.apache.spark.deploy.SparkHadoopUtil$.(SparkHadoopUtil.scala:109)
at org.apache.spark.deploy.SparkHadoopUtil$.(SparkHadoopUtil.scala)
at org.apache.spark.SparkContext.(SparkContext.scala:228)
at org.apache.spark.SparkContext.(SparkContext.scala:97)
at my.example.EtlTest.testETL(IxtoolsDailyAggTest.scala:13)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at junit.framework.TestCase.runTest(TestCase.java:168)
at junit.framework.TestCase.runBare(TestCase.java:134)
at junit.framework.TestResult$1.protect(TestResult.java:110)
at junit.framework.TestResult.runProtected(TestResult.java:128)
at junit.framework.TestResult.run(TestResult.java:113)
at junit.framework.TestCase.run(TestCase.java:124)
at junit.framework.TestSuite.runTest(TestSuite.java:232)
at junit.framework.TestSuite.run(TestSuite.java:227)
at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:81)
at org.junit.runner.JUnitCore.run(JUnitCore.java:130)
at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:74)
at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:211)
at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:67)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:120)
Thank you,
Konstantin Kudryavtsev
Hi Konstatin,
We use hadoop as a library in a few places in Spark. I wonder why the path includes "null" though.
Could you provide the full stack trace?
Andrew
2014-07-02 9:38 GMT-07:00 Konstantin Kudryavtsev :
Hi all,
I'm trying to run some transformation on Spark, it works fine on cluster (YARN, linux machines). However, when I'm trying to run it on local machine (Windows 7) under unit test, I got errors:
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:318)
at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:333)
at org.apache.hadoop.util.Shell.(Shell.java:326)
at org.apache.hadoop.util.StringUtils.(StringUtils.java:76)
at org.apache.hadoop.security.Groups.parseStaticMapping(Groups.java:93)
My code is following:
@Test
def testETL() = {
    val conf = new SparkConf()
    val sc = new SparkContext("local", "test", conf)
    try {
        val etl = new IxtoolsDailyAgg() // empty constructor
        val data = sc.parallelize(List("in1", "in2", "in3"))
        etl.etl(data) // rdd transformation, no access to SparkContext or Hadoop
        Assert.assertTrue(true)
    } finally {
        if(sc != null)
            sc.stop()
    }
}
Why is it trying to access hadoop at all? and how can I fix it? Thank you in advance
Thank you,
Konstantin Kudryavtsev

answered Jul 3 2014 at 09:06

Denny Lee 's gravatar image


Related discussions

Tagged

Group Spark-user

asked Jul 2 2014 at 09:38

active Jul 3 2014 at 09:06

posts:8

users:4

Spark-dev

Spark-user

©2013 QnaList.com