Enum value implementing Writable interface of Hadoop - java

Suppose I have an enumeration:
public enum SomeEnumType implements Writable {
    A(0), B(1);
    private int value;
    private SomeEnumType(int value) {
        this.value = value;
    }
    @Override
    public void write(final DataOutput dataOutput) throws IOException {
        dataOutput.writeInt(this.value);
    }
    @Override
    public void readFields(final DataInput dataInput) throws IOException {
        this.value = dataInput.readInt();
    }
}
I want to pass an instance of it as part of another class instance.
Comparing with equals would not work, because it does not consider the enum's inner variable. Besides, all enum instances are fixed at compile time and cannot be created elsewhere.
Does this mean I cannot send enums over the wire in Hadoop, or is there a solution?

My normal and preferred solution for enums in Hadoop is serializing the enums through their ordinal value.
public class EnumWritable implements Writable {
    static enum EnumName {
        ENUM_1, ENUM_2, ENUM_3
    }
    private int enumOrdinal;
    // never forget your default constructor in Hadoop Writables
    public EnumWritable() {
    }
    public EnumWritable(Enum<?> arbitraryEnum) {
        this.enumOrdinal = arbitraryEnum.ordinal();
    }
    public int getEnumOrdinal() {
        return enumOrdinal;
    }
    @Override
    public void readFields(DataInput in) throws IOException {
        enumOrdinal = in.readInt();
    }
    @Override
    public void write(DataOutput out) throws IOException {
        out.writeInt(enumOrdinal);
    }
    public static void main(String[] args) {
        // use it like this:
        EnumWritable enumWritable = new EnumWritable(EnumName.ENUM_1);
        // let Hadoop do the write and read stuff
        EnumName yourDeserializedEnum = EnumName.values()[enumWritable.getEnumOrdinal()];
    }
}
Obviously this has drawbacks: ordinals can change, so if you swap ENUM_2 with ENUM_3 and read a previously serialized file, you will get the wrong enum constant back.
So if you know the enum class beforehand, you can write the name of your enum and use it like this:
enumInstance = EnumName.valueOf(in.readUTF());
This uses slightly more space, but it is safer when enum constants are reordered or added (renaming a constant still breaks previously serialized data).
The full example would look like this:
public class EnumWritable implements Writable {
    static enum EnumName {
        ENUM_1, ENUM_2, ENUM_3
    }
    private EnumName enumInstance;
    // never forget your default constructor in Hadoop Writables
    public EnumWritable() {
    }
    public EnumWritable(EnumName e) {
        this.enumInstance = e;
    }
    public EnumName getEnum() {
        return enumInstance;
    }
    @Override
    public void write(DataOutput out) throws IOException {
        out.writeUTF(enumInstance.name());
    }
    @Override
    public void readFields(DataInput in) throws IOException {
        enumInstance = EnumName.valueOf(in.readUTF());
    }
    public static void main(String[] args) {
        // use it like this:
        EnumWritable enumWritable = new EnumWritable(EnumName.ENUM_1);
        // let Hadoop do the write and read stuff
        EnumName yourDeserializedEnum = enumWritable.getEnum();
    }
}
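To see the name-based round trip without a Hadoop cluster, here is a minimal sketch that uses only java.io streams in place of the DataOutput/DataInput that Hadoop would supply. The class EnumNameRoundTrip and its helper methods are hypothetical names made up for this illustration; only writeUTF, readUTF, name() and valueOf() come from the answer above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class EnumNameRoundTrip {
    // Hypothetical enum, mirroring the one in the answer.
    enum EnumName { ENUM_1, ENUM_2, ENUM_3 }

    // Writes the constant's name, as write(DataOutput) does in the answer.
    static byte[] serialize(EnumName e) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(e.name());
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return bytes.toByteArray();
    }

    // Reads the name back and resolves it to the constant, as readFields does.
    static EnumName deserialize(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            return EnumName.valueOf(in.readUTF());
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        byte[] wire = serialize(EnumName.ENUM_2);
        // writeUTF emits a 2-byte length followed by the encoded characters.
        System.out.println(wire.length + " bytes, read back as " + deserialize(wire));
    }
}
```

Because the bytes carry the constant's name rather than its position, reordering the constants in EnumName would not change what deserialize returns for data written earlier.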

WritableUtils has convenience methods that make this easier.
WritableUtils.writeEnum(dataOutput, enumData);
enumData = WritableUtils.readEnum(dataInput, MyEnum.class);

I don't know anything about Hadoop, but based on the documentation of the interface, you could probably do it like this:
public void readFields(DataInput in) throws IOException {
    // do nothing
}
public static SomeEnumType read(DataInput in) throws IOException {
    int value = in.readInt();
    if (value == 0) {
        return SomeEnumType.A;
    }
    else if (value == 1) {
        return SomeEnumType.B;
    }
    else {
        throw new IOException("Invalid value " + value);
    }
}

Related

How to implement custom writable's write and readFields for a map or list of custom object?

I have a wrapper class that contains the following:
class myWrapperClass {
    Map<Long, myInnerClass> myMap;
    int myInt1;
    int myInt2;
}
class myInnerClass {
    int myInnerInt;
    long myInnerLong;
}
I want to have a custom Writable. So far I have the following:
@Override
public void write(DataOutput out) throws IOException {
    out.writeInt(this.myInt1);
    out.writeInt(this.myInt2);
    // What do I do here
}
@Override
public void readFields(DataInput in) throws IOException {
    myInt1 = in.readInt();
    // must mirror write(): myInt2 was written with writeInt, so read it with readInt
    myInt2 = in.readInt();
    // What do I do here
}
I am not sure how to write a custom object to DataOutput and read it back from DataInput.
Can someone point me in the right direction?
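One common approach, sketched here as an assumption rather than a definitive answer, is to write the map's size first, then each key followed by the fields of its value, and to read everything back in exactly the same order. The sketch uses only java.io so that it runs without Hadoop. Wrapper, Inner, sample and roundTrip are hypothetical names; in a real job the two classes would implement Hadoop's Writable and keep a no-argument constructor.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

class MapFieldRoundTrip {
    // Stand-in for myInnerClass: it serializes its own fields.
    static class Inner {
        int myInnerInt;
        long myInnerLong;

        void write(DataOutput out) throws IOException {
            out.writeInt(myInnerInt);
            out.writeLong(myInnerLong);
        }

        void readFields(DataInput in) throws IOException {
            myInnerInt = in.readInt();
            myInnerLong = in.readLong();
        }
    }

    // Stand-in for myWrapperClass.
    static class Wrapper {
        final Map<Long, Inner> myMap = new HashMap<>();
        int myInt1;
        int myInt2;

        void write(DataOutput out) throws IOException {
            out.writeInt(myInt1);
            out.writeInt(myInt2);
            out.writeInt(myMap.size());            // entry count comes first
            for (Map.Entry<Long, Inner> e : myMap.entrySet()) {
                out.writeLong(e.getKey());         // then each key...
                e.getValue().write(out);           // ...followed by its value
            }
        }

        void readFields(DataInput in) throws IOException {
            myInt1 = in.readInt();
            myInt2 = in.readInt();
            myMap.clear();                         // the same object may be reused for several records
            int size = in.readInt();
            for (int i = 0; i < size; i++) {
                long key = in.readLong();
                Inner inner = new Inner();
                inner.readFields(in);
                myMap.put(key, inner);
            }
        }
    }

    // Builds a small example object.
    static Wrapper sample() {
        Wrapper w = new Wrapper();
        w.myInt1 = 1;
        w.myInt2 = 2;
        Inner inner = new Inner();
        inner.myInnerInt = 3;
        inner.myInnerLong = 7L;
        w.myMap.put(42L, inner);
        return w;
    }

    // Serializes to a byte array and reads the bytes into a fresh instance.
    static Wrapper roundTrip(Wrapper w) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            w.write(new DataOutputStream(bytes));
            Wrapper copy = new Wrapper();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Wrapper copy = roundTrip(sample());
        System.out.println(copy.myInt1 + " " + copy.myInt2 + " " + copy.myMap.get(42L).myInnerLong);
    }
}
```

A list can be handled the same way: write its size, then each element in order.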

Hadoop mapreduce custom writable static context

I'm working on a university homework assignment and we have to use Hadoop MapReduce for it. I'm trying to create a new custom Writable because I want to output key-value pairs as (key, (doc_name, 1)).
public class Detector {
    private static final Path TEMP_PATH = new Path("temp");
    private static final String LENGTH = "gramLength";
    private static final String THRESHOLD = "threshold";
    public class Custom implements Writable {
        private Text document;
        private IntWritable count;
        public Custom(){
            setDocument("");
            setCount(0);
        }
        public Custom(String document, int count) {
            setDocument(document);
            setCount(count);
        }
        @Override
        public void readFields(DataInput in) throws IOException {
            // TODO Auto-generated method stub
            document.readFields(in);
            count.readFields(in);
        }
        @Override
        public void write(DataOutput out) throws IOException {
            document.write(out);
            count.write(out);
        }
        public int getCount() {
            return count.get();
        }
        public void setCount(int count) {
            this.count = new IntWritable(count);
        }
        public String getDocument() {
            return document.toString();
        }
        public void setDocument(String document) {
            this.document = new Text(document);
        }
    }
    public static class NGramMapper extends Mapper<Text, Text, Text, Text> {
        private int gramLength;
        private Pattern space_pattern=Pattern.compile("[ ]");
        private StringBuilder gramBuilder= new StringBuilder();
        @Override
        protected void setup(Context context) throws IOException, InterruptedException{
            gramLength=context.getConfiguration().getInt(LENGTH, 0);
        }
        public void map(Text key, Text value, Context context) throws IOException, InterruptedException {
            String[] tokens=space_pattern.split(value.toString());
            for(int i=0;i<tokens.length;i++){
                gramBuilder.setLength(0);
                if(i+gramLength<=tokens.length){
                    for(int j=i;j<i+gramLength;j++){
                        gramBuilder.append(tokens[j]);
                        gramBuilder.append(" ");
                    }
                    context.write(new Text(gramBuilder.toString()), key);
                }
            }
        }
    }
    public static class OutputReducer extends Reducer<Text, Text, Text, Custom> {
        public void reduce(Text key, Iterable<Text> values, Context context)
                throws IOException, InterruptedException {
            for (Text val : values) {
                context.write(key,new Custom(val.toString(),1));
            }
        }
    }
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        conf.setInt(LENGTH, Integer.parseInt(args[0]));
        conf.setInt(THRESHOLD, Integer.parseInt(args[1]));
        // Setup first MapReduce phase
        Job job1 = Job.getInstance(conf, "WordOrder-first");
        job1.setJarByClass(Detector.class);
        job1.setMapperClass(NGramMapper.class);
        job1.setReducerClass(OutputReducer.class);
        job1.setMapOutputKeyClass(Text.class);
        job1.setMapOutputValueClass(Text.class);
        job1.setOutputKeyClass(Text.class);
        job1.setOutputValueClass(Custom.class);
        job1.setInputFormatClass(WholeFileInputFormat.class);
        FileInputFormat.addInputPath(job1, new Path(args[2]));
        FileOutputFormat.setOutputPath(job1, new Path(args[3]));
        boolean status1 = job1.waitForCompletion(true);
        if (!status1) {
            System.exit(1);
        }
    }
}
When I compile the code to a class file I get this error:
Detector.java:147: error: non-static variable this cannot be referenced from a static context
context.write(key,new Custom(val.toString(),1));
I followed different tutorials about custom Writables and my solution is the same as theirs. Any suggestions?
Static fields and methods are shared by all instances. They are for values that belong to the class itself rather than to a specific instance. Stay away from them as much as possible.
To solve your problem, you need to create an instance (an object) of your class so that the runtime can reserve memory for it, or change the part you are accessing so that it has static access (not recommended!).
The keyword this refers to something that is an instance, not to something static, which should be referenced by the class name instead. You are using it in a static context, which is not allowed.
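In the question's code, Custom is declared as a non-static inner class of Detector, so new Custom(...) inside the static OutputReducer has no enclosing Detector instance to attach to. One way to give it "static access" is to declare the nested class static. The sketch below is plain Java with hypothetical names (StaticNestedDemo, Worker, newViaReflection) and no Hadoop types; it also shows that a static nested class with a no-argument constructor can be created reflectively, which is how frameworks typically instantiate such value classes.

```java
class StaticNestedDemo {
    // Declared static, so it needs no enclosing StaticNestedDemo instance.
    static class Custom {
        private final String document;
        private final int count;

        Custom() {
            this("", 0);
        }

        Custom(String document, int count) {
            this.document = document;
            this.count = count;
        }

        String getDocument() {
            return document;
        }

        int getCount() {
            return count;
        }
    }

    // Stands in for the static reducer class in the question.
    static class Worker {
        static Custom emit(String document) {
            // Compiles because Custom is static; with a non-static inner class
            // this line would fail with "non-static variable this cannot be
            // referenced from a static context".
            return new Custom(document, 1);
        }
    }

    // Creates a Custom through its no-argument constructor via reflection.
    static Custom newViaReflection() {
        try {
            return Custom.class.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Custom c = Worker.emit("doc1");
        System.out.println(c.getDocument() + " -> " + c.getCount());
    }
}
```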

Externalize a final instance variable

Here's my sample code:
public class ExternalizableClass implements Externalizable
{
    final int id;
    public ExternalizableClass()
    {
        id = 0;
    }
    public ExternalizableClass(int i)
    {
        id = i;
    }
    @Override
    public void writeExternal(ObjectOutput out) throws IOException
    {
        out.writeInt(id);
    }
    @Override
    public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException
    {
        id = in.readInt();
    }
    @Override
    public String toString()
    {
        return "id: " + id;
    }
}
It fails to compile because id = in.readInt(); gives Error:(36, 5) java: cannot assign a value to final variable id. However, I can think of real use cases where an immutable field, such as id, needs to be externalized, while we also want to preserve its immutability.
So what's the correct way to resolve this issue?
The read function doesn't make much sense with the idea of a final field, because whatever value it was initialized to should be its value, forever. The read function shouldn't be able to change it.
Clearly objects initialized with the public ExternalizableClass(int i) constructor shouldn't be able to read a new value - if they can then their id value isn't really final. The only other way I could see doing this is making the default constructor initialize an "unread" instance, allowing you to call read on it later. This does, however, require removing the final modifier and working around that. So that would look like this:
public class ExternalizableClass implements Externalizable
{
    private int id;
    private boolean initted;
    int getId(){
        return id;
    }
    public ExternalizableClass(int i, boolean initted){
        id = i;
        this.initted = initted;
    }
    public ExternalizableClass(){
        // Deserialization calls this constructor before readExternal,
        // so default instances must start out "unread"
        this(0, false);
    }
    public ExternalizableClass(int i)
    {
        this(i, true); //Instances from this constructor can't be changed
    }
    @Override
    public void writeExternal(ObjectOutput out) throws RuntimeException, IOException
    {
        if(! initted)
            throw new RuntimeException("Can't write uninitialized instance, " + this);
        out.writeInt(id);
    }
    @Override
    public void readExternal(ObjectInput in) throws RuntimeException, IOException, ClassNotFoundException
    {
        if(initted)
            throw new RuntimeException("Can't read into already initialized object, " + this);
        id = in.readInt();
        initted = true;
    }
    @Override
    public String toString()
    {
        if(initted) return "id: " + id;
        else return "No id";
    }
}
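To check the "unread instance" idea end to end, here is a minimal, self-contained sketch that serializes and deserializes such an object with the standard object streams. ExternalizableDemo, GuardedId, roundTrip and rejectsSecondRead are hypothetical names for this illustration. It assumes, as the Externalizable contract describes, that deserialization first calls the public no-argument constructor and then readExternal.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

class ExternalizableDemo {
    public static class GuardedId implements Externalizable {
        private int id;
        private boolean initialized;

        // Used by deserialization: starts "unread" so readExternal may set id once.
        public GuardedId() {
            this.initialized = false;
        }

        public GuardedId(int id) {
            this.id = id;
            this.initialized = true;
        }

        public int getId() {
            return id;
        }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            if (!initialized) {
                throw new IllegalStateException("cannot write an unread instance");
            }
            out.writeInt(id);
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException {
            if (initialized) {
                throw new IllegalStateException("id is already set");
            }
            id = in.readInt();
            initialized = true;
        }
    }

    // Serializes the object to bytes and reads it back as a new instance.
    static GuardedId roundTrip(GuardedId original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (GuardedId) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns true if an already initialized instance refuses a second read.
    // The guard fires before the stream is touched, so passing null is safe here.
    static boolean rejectsSecondRead(GuardedId initializedInstance) {
        try {
            initializedInstance.readExternal(null);
            return false;
        } catch (IllegalStateException expected) {
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        GuardedId copy = roundTrip(new GuardedId(42));
        System.out.println("id after round trip: " + copy.getId());
        System.out.println("second read rejected: " + rejectsSecondRead(copy));
    }
}
```

The field is no longer final in the language sense, but the flag keeps it effectively write-once: it is set either by the constructor or by a single call to readExternal.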

Serialize generic field from java object to json

I have a generic field in User.java. I want to use the value of T in the JSON output.
public class User<T> {
    public enum Gender {MALE, FEMALE};
    private T field;
    private Gender _gender;
    private boolean _isVerified;
    private byte[] _userImage;
    public T getField() { return field; }
    public boolean isVerified() { return _isVerified; }
    public Gender getGender() { return _gender; }
    public byte[] getUserImage() { return _userImage; }
    public void setField(T f) { field = f; }
    public void setVerified(boolean b) { _isVerified = b; }
    public void setGender(Gender g) { _gender = g; }
    public void setUserImage(byte[] b) { _userImage = b; }
}
and the mapper class is:
public class App
{
    public static void main( String[] args ) throws JsonParseException, JsonMappingException, IOException
    {
        ObjectMapper mapper = new ObjectMapper();
        Name n = new Name();
        n.setFirst("Harry");
        n.setLast("Potter");
        User<Name> user = new User<Name>();
        user.setField(n);
        user.setGender(Gender.MALE);
        user.setVerified(false);
        mapper.writeValue(new File("user1.json"), user);
    }
}
and the JSON output is:
{"field":{"first":"Harry","last":"Potter"},"gender":"MALE","verified":false,"userImage":null}
In the output, I want name to appear in place of field. How do I do that?
I think what you are asking for is not the default JSON behavior. The field name is the "key" of the JSON map, not the name of the type. You should rename the field or do some string processing to achieve it.
private T field;
Change the above to this:
private T name;
You need a custom serializer to do that. That's a runtime data transformation and Jackson has no support for data transformation other than with a custom serializer (well, there's wrapping/unwrapping of value, but let's not go there). Also, you will need to know in advance every type of transformation you want to apply inside your serializer. The following works:
public class UserSerializer extends JsonSerializer<User<?>> {
    private static final String USER_IMAGE_FIELD = "userImage";
    private static final String VERIFIED_FIELD = "verified";
    private static final String FIELD_FIELD = "field";
    private static final String NAME_FIELD = "name";
    @Override
    public void serialize(User<?> value, JsonGenerator jgen, SerializerProvider provider) throws IOException,
            JsonProcessingException {
        jgen.writeStartObject();
        // User's fields are private, so read them through its public getters
        Object field = value.getField();
        if (field instanceof Name) {
            jgen.writeFieldName(NAME_FIELD);
        } else {
            jgen.writeFieldName(FIELD_FIELD);
        }
        jgen.writeObject(field);
        jgen.writeStringField("gender", value.getGender().name());
        jgen.writeBooleanField(VERIFIED_FIELD, value.isVerified());
        if (value.getUserImage() == null) {
            jgen.writeNullField(USER_IMAGE_FIELD);
        } else {
            jgen.writeBinaryField(USER_IMAGE_FIELD, value.getUserImage());
        }
        jgen.writeEndObject();
    }
}

java - reflection: How to Override private static abstract inner class method?

I have the following class:
class MyClass{
    private static final int VERSION_VALUE = 8;
    private static final String VERSION_KEY = "versionName";
    public boolean myPublicMethod(String str) {
        try {
            return myPrivateMethod(str, VERSION_KEY, VERSION_VALUE,
                new MyInnerClass() {
                    @Override
                    public InputStream loadResource(String name) {
                        //do something important
                    }
                });
        }
        catch (Exception e) {
        }
        return false;
    }
    private boolean myPrivateMethod(String str, String key, int version,
            ResourceLoader resourceLoader) throws Exception
    {
        //do something
    }
    private static abstract class MyInnerClass {
        public abstract InputStream loadResource(String name);
    }
}
I want to write a unit test for myPrivateMethod, for which I need to pass a resourceLoader object and override its loadResource method.
Here is my test method:
@Test
public void testMyPrivateMethod() throws Exception {
    Class<?> cls = Class.forName("my.pack.MyClass$MyInnerClass");
    Method method = cls.getDeclaredMethod("loadResource", String.class);
    //create inner class instance and override method
    Whitebox.invokeMethod(myClassObject, "testValue1", "testValue2", "name1", 10, innerClassObject);
}
Note that I can't change the code.
Well, you could use Javassist...
See this question. I haven't tried this, but you can call this method when you want the override:
public <T extends Object> T getOverride(Class<T> cls, MethodHandler handler) {
    ProxyFactory factory = new ProxyFactory();
    factory.setSuperclass(cls);
    factory.setFilter(
        new MethodFilter() {
            @Override
            public boolean isHandled(Method method) {
                return Modifier.isAbstract(method.getModifiers());
            }
        }
    );
    return (T) factory.create(new Class<?>[0], new Object[0], handler);
}
Well, the problem I see with your code is that myPublicMethod passes new MyInnerClass() as the fourth argument, while the private method declares its fourth parameter as ResourceLoader, and from your code I see no relation between MyInnerClass and ResourceLoader. So you can try the following code. It might help.
Despite your warning that you cannot change the code, I have changed it because I was trying to run it.
class MyClass{
    private static final int VERSION_VALUE = 8;
    private static final String VERSION_KEY = "versionName";
    public boolean myPublicMethod(String str) {
        try {
            return myPrivateMethod(str, VERSION_KEY, VERSION_VALUE,
                new MyInnerClass() {
                    @Override
                    public InputStream loadResource(String name) {
                        return null;
                        //do something important
                    }
                });
        }
        catch (Exception e) {
        }
        return false;
    }
    private boolean myPrivateMethod(String str, String key, int version,
            MyInnerClass resourceLoader) throws Exception
    {
        return false;
        //do something
    }
    private static abstract class MyInnerClass {
        public abstract InputStream loadResource(String name);
    }
}
Hope it helps.
