(I'm using Java, so I added the 'Java' tag in case it influences any answers, though some may argue that the tag is unnecessary.)
Consider the following:
I have a veranda/balcony to plan graphically. As part of the final plan I am required to list any materials required to construct the veranda. To simplify the calculating process I've split the veranda into different Sections:
Above Deck - which involves the plastic fencing components and fixings (e.g. screws, bolts)
Below Deck - which involves timber subframes and more fixings.
...plus others which I won't go into in depth.
For argument's sake: there are 100 different stock items available due to the different fencing designs we offer. Each item has its own class including information such as dimensions, colour, quantity and its ID reference.
What I would like to do is create a list of fixings required for each section and add them all up into one final list without any duplications.
The final idea is to display the item name and a quantity next to the processed plan.
So for example:
Section 1 requires:
- 25 x Screw A
- 15 x Screw B
- 5 x Screw C
Section 2 requires:
- 25 x Screw A
- 15 x Screw B
- 5 x Screw F
Section 3 requires:
- 45 x Screw B
- 50 x Screw C
- 24 x Screw G
Total List
Section 1 + Section 2 + Section 3 = {Complete list of materials}
What I've tried/considered:
All the screws I've mentioned above have their own class and extend the super class "Fasteners". Each Section mentioned above has a variable:
ArrayList<Fasteners>
So as I'm calculating I can add the screws to this variable. Once each section has calculated how many Fasteners it requires I then add them altogether.
I've considered making an "add(Fixings)" method in the "Fasteners" super class which adds any duplicate items together. But because there are 100 items, I would have thought the coding wouldn't be very efficient, and I suspect there is a better way of making use of the polymorphism I've set up here. Any online references or hints would be helpful and very much appreciated.
Here's a suggestion:
Make an enum called FastenerType, then use an EnumMap<FastenerType, AtomicInteger> to hold your fasteners. Let's call it the bag.
Then do something like
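import java.util.EnumMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

public class FastenerBag {
    // EnumMap needs the enum's Class object as its key type
    private final Map<FastenerType, AtomicInteger> fasteners = new EnumMap<>(FastenerType.class);
    FastenerBag() {
        // init your ints to zero
        for (FastenerType type : FastenerType.values()) {
            fasteners.put(type, new AtomicInteger());
        }
    }
    public void addFastenersFor(VerandaPart part) {
        // assumes getRequiredFasteners() returns a count for every type
        for (FastenerType type : FastenerType.values()) {
            fasteners.get(type).addAndGet(part.getRequiredFasteners().get(type));
        }
    }
}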
Don't worry too much about performance. First get it working. Then get it readable (refactor), then optimize.
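To make the "add them all up without duplicates" step concrete, here is a minimal sketch of the same bag idea. It assumes a hypothetical FastenerType enum with only the screws from the question's example, and it uses plain Integer counts with Map.merge rather than AtomicInteger; FastenerTotals is an illustrative name, not something from the question.

```java
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Hypothetical enum: one constant per stock item used in the example.
enum FastenerType { SCREW_A, SCREW_B, SCREW_C, SCREW_F, SCREW_G }

class FastenerTotals {
    // Sums the per-section quantities into one map with a single entry per type.
    static Map<FastenerType, Integer> total(List<Map<FastenerType, Integer>> sections) {
        Map<FastenerType, Integer> totals = new EnumMap<>(FastenerType.class);
        for (Map<FastenerType, Integer> section : sections) {
            // merge() inserts the quantity, or adds it to the count already stored
            section.forEach((type, qty) -> totals.merge(type, qty, Integer::sum));
        }
        return totals;
    }
}
```

With the three sections from the question this should give 50 x Screw A, 75 x Screw B, 55 x Screw C, 5 x Screw F and 24 x Screw G.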
I have 2 strings in an array. I want there to be a 10% chance of one and 90% chance to select the other. Right now I am using:
Random random = new Random();
int x = random.nextInt(100 - 1) + 1;
if (x < 10) {
    string = stringArray[0];
} else {
    string = stringArray[1];
}
Is this the best way of accomplishing this or is there a better method?
I know it's typically a bad idea to submit a Stack Overflow response without submitting code, but I really challenge this question of "the best way." People ask this all the time and, while there are established design patterns in software worth knowing, this question can almost always be answered by "it depends."
For example, your pattern looks fine (I might add some comments). You might get a minuscule performance increase by using 1 - 10 instead of 1 - 100, but the things you need to ask yourself are as follows:
If I get hit by a bus, is the person who is going to be working on the application going to know what I was trying to do?
If it isn't intuitive, I should write a comment. Then I should ask myself, "Can I change this code so that a comment isn't necessary?"
Is there an existing library that solves this problem? If so, is it FOSS approved (if applicable) / can I use it?
What is the size of this codebase eventually going to be? Am I making a full program with microservices, a DAO, DTO, Controller, View, and different layers for validation?
Is there an existing convention to solve my problem (either at my company or in general), or is it unique enough that I can take my own spin on it?
Does this follow the DRY principle?
I'm in (apparently) a very small camp on Stack Overflow that doesn't always believe in universal "bests" for solving code problems. Just remember, programming is only as hard as the problem you're trying to solve.
EDIT
Since people asked, I'd do it like this:
/**
 * Convenience method that returns option A with 90% odds and option B with 10% odds.
 *
 * @author DaveCat
 * @version 1.0
 * @since 2019-03-9
 */
public static String calculatesNinetyPercent()
{
    Random random = new Random();
    // nextInt(10) yields 0..9, so x is uniform over 1..10
    int x = random.nextInt(10) + 1;
    //Option A
    if (x <= 9) {
        return stringArray[0];
    }
    else
    {
        //Option B
        return stringArray[1];
    }
}
As an aside, one of the common mistakes junior devs make in enterprise level development is excessive comments. This has a javadoc, which is probably overkill, but I'm assuming this is a convenience method you're using in a greater program.
Edit (again)
You guys keep confusing me. This is how you randomly generate between 2 given numbers in Java
One alternative is to use a random float value between 0 and 1 and compare it to the probability of the event. If the random value is less than the probability, then the event occurs.
In this specific example, set x to a random float and compare it to 0.1
I like this method because it can be used for probabilities other than percent integers.
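A minimal sketch of that comparison, assuming a two-element array as in the question. WeightedPick and pick are illustrative names, and the Random is passed in so the caller can seed it.

```java
import java.util.Random;

class WeightedPick {
    // Returns options[0] with the given probability, otherwise options[1].
    // nextDouble() is uniform over [0.0, 1.0), so the comparison succeeds
    // with probability probabilityOfFirst.
    static String pick(String[] options, double probabilityOfFirst, Random random) {
        return random.nextDouble() < probabilityOfFirst ? options[0] : options[1];
    }
}
```

For the question's case you would call pick(stringArray, 0.1, random); a probability such as 0.125 works just as well, which is the advantage over integer percentages.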
So I realize this text format may not be very unusual. However, I've been trying many ideas to read this correctly into the objects needed and know there has to be a better way. Here is what the file looks like:
S S n
B 1 E 2
B N n
C 2 F 3
C N n
D 2 GA 4
D N n
GA 1
E N n
B 1 F 3 H 6
F N n
I 3 GA 3 C 1
A N g
H B b
U 2 GB 2 F 1
I N n
GA 2 GB 2
GB N g
So the first line of each pair is the name of the node (S), then whether it's a starting node (S/N), then whether it's a goal node (g/n).
The second line is the children of node S, and their distance weight.
For example, Node S has child node B with a distance of 1, and child node E with a distance of 2
I'm working with two object types, Nodes, and Edges.
Example below. (Constructors omitted)
Does anyone have tips on how to read this type of input efficiently?
public class Edge {
    public String start;
    public String end;
    public int weight;
}

public class Node {
    public String name;
    public boolean start = false;
    public boolean goal = false;
    public ArrayList<Edge> adjecentNodes = new ArrayList<Edge>();
}
Actually your question is almost too broad and unspecific, but I am in the mood to give you some starting points. But please understand that you could easily fill several hours of computer science lectures on this topic; and that is not going to happen here.
First, you have to clarify some requirements (for yourself; or together with the folks working on this project):
Do you really need to focus on efficient reading/building of your graph? I rather doubt that: building the graph happens once; but the computations that you probably do later on may run for a much longer time. So one primary focus should be on designing that object/class model that allows you to efficiently solve the problems that you want to solve on that graph! For example: it might be beneficial to already sort edges by distance/weight when creating the graph. Or maybe not. Depends on later use cases! And even when you are talking about huge files that need efficient processing ... that still means: you are talking about huge graphs; so all the more reason to find a good model for that graph.
Your description of the file is not clear. For example: is this a directed or an undirected graph? Meaning: can you travel on any edge in both directions? And sorry, I didn't get what a "goal" node is supposed to be. (I guess you have directed edges that go one way only, as that would explain those rows in the example where nodes do not have any children.) Of course, sometimes requirements become clear in that moment when you start writing real code. But this here is really about concepts/data/algorithms. So the earlier you answer all such questions, the better for you.
Secondly, a suggestion in which order to do things:
As said, clarify all your requirements. Spend some serious time just thinking about the properties of the graphs you are dealing with; and what problems you later have to solve on them.
Start coding; ideally you use TDD/unit testing here, because all of the things you are going to do can be nicely sliced into small work packages, and each one can be tested with unit tests. Do not write all your code first and only then, after 2 days, run your first experiments! The first thing you code: your Node/Edge classes ... because you want to play around with things like: what arguments do my constructors need? How can I make my classes immutable (data is pushed in by constructors only)? Do I want distance to be a property of my Edge, or can I just go with Node objects (and represent edges as Map<Node, Integer>, meaning each node just knows its neighbors and the distance to get there)?
Then, when you are convinced that your Node/Edge classes fit your problem, you start writing code that takes strings and builds Nodes/Edges out of those strings.
You also want to spend some time on writing good dump methods; ideally you call graph.dump() ... and that produces a string matching your input format (makes a nice test later on: reading + dumping should result in identical files!)
Then, when "building from strings" works ... then you write the few lines of "file parsing code" that uses some BufferedReader, FileReader, Scanner, Whatever mechanism to dissect your input file into strings ... which you then feed into the methods you created above for step 3.
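As a rough illustration of the "build from strings first" step, here is a minimal sketch that works on a list of lines rather than a file. It rests on several assumptions about the format that the question does not confirm: lines come in pairs (header, then children), a childless node has an empty or missing second line, "S" in the second column marks the start node, and "g" in the third column marks a goal. GraphParser and ParsedNode are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class GraphParser {
    // Hypothetical minimal node: edges are stored as neighbor name -> weight.
    static final class ParsedNode {
        final String name;
        final boolean start;
        final boolean goal;
        final Map<String, Integer> edges = new LinkedHashMap<>();

        ParsedNode(String name, boolean start, boolean goal) {
            this.name = name;
            this.start = start;
            this.goal = goal;
        }
    }

    // Reads header/children line pairs and returns the nodes keyed by name.
    static Map<String, ParsedNode> parse(List<String> lines) {
        Map<String, ParsedNode> nodes = new LinkedHashMap<>();
        for (int i = 0; i < lines.size(); i += 2) {
            String[] header = lines.get(i).trim().split("\\s+");
            ParsedNode node = new ParsedNode(
                    header[0],
                    header[1].equalsIgnoreCase("S"),   // assumed start marker
                    header[2].equalsIgnoreCase("g"));  // assumed goal marker
            String childLine = i + 1 < lines.size() ? lines.get(i + 1).trim() : "";
            if (!childLine.isEmpty()) {
                String[] tokens = childLine.split("\\s+");
                // tokens alternate: child name, weight, child name, weight, ...
                for (int t = 0; t + 1 < tokens.length; t += 2) {
                    node.edges.put(tokens[t], Integer.parseInt(tokens[t + 1]));
                }
            }
            nodes.put(node.name, node);
        }
        return nodes;
    }
}
```

Because parse takes a List<String>, it can be unit tested with a handful of literal lines, and the file-reading code only has to collect lines and hand them over.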
And, seriously: if this is for school/learning:
Try to talk to your peers often. Have them look at what you are doing. Not too early, but also not too late.
Really, seriously: consider throwing away stuff; and starting from scratch again. For each of the steps above, or after going through the whole sequence. It is an incredible experience to do that; because typically, you come up with new, different, interesting ideas each time you do that.
And finally, some specific hints:
It is tempting to use "public" (writable) fields like start/end/... in your classes. But: consider not doing that. Try to hide as much of the internals of your classes. Because that will make it easier (or possible!) later on to change one part of your program without the need to change anything else, too.
Example:
class Edge {
    private final int distance;
    private final Node target;
    public Edge(int distance, Node target) {
        this.distance = distance;
        this.target = target;
    }
    ...
This creates an immutable object - you can't change its core internal properties after the object was created. That is very often helpful.
Then: override methods like toString(), equals(), hashCode() in your classes; and use them. For example, toString() can be used to create a nice, human-readable dump of a node.
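For illustration, a minimal sketch of what those overrides could look like on an immutable edge. WeightedEdge is a hypothetical stand-in that stores the target as a name, so the example stays self-contained; the toString format is just one possible choice.

```java
import java.util.Objects;

final class WeightedEdge {
    private final String target;
    private final int distance;

    WeightedEdge(String target, int distance) {
        this.target = target;
        this.distance = distance;
    }

    // Two edges are equal when they lead to the same target over the same distance.
    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof WeightedEdge)) return false;
        WeightedEdge that = (WeightedEdge) other;
        return distance == that.distance && target.equals(that.target);
    }

    // Must agree with equals(): equal edges produce equal hash codes.
    @Override
    public int hashCode() {
        return Objects.hash(target, distance);
    }

    // Human-readable form, handy when dumping a node's neighbors.
    @Override
    public String toString() {
        return "-> " + target + " (" + distance + ")";
    }
}
```

With equals() and hashCode() in place, edges behave sensibly in a HashSet or as map keys, and unit tests can compare them directly.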
Finally: if you liked all of that, consider remembering my user id; and when you reach enough reputation to "upvote", come back and upvote ;-)
In my book there is an example which explains the differences between arrays in Java and C.
In Java we can create an array by writing:
int[] a = new int[5];
In C, the declaration below just allocates storage space on the stack for five integers, and we can access them exactly as we would have done in Java:
int a[5] = {0};
int i;
for (i = 0; i < 5; i++) {
    printf("%2d: %7d\n", i, a[i]);
}
Then the author says the following
Of course our program should not use a number 5 as we did on several places in the example, instead we use a constant. We can use the C preprocessor to do this:
#define SIZE 5
What are advantages of defining a constant SIZE 5?
Using a named constant is generally considered good practice because if it is used in multiple places, you only need to change the definition to change the value, rather than change every occurrence - which is error prone.
For example, as mentioned by stark in the comments, it is likely that you'll want to loop over an array. If the size of the array is defined by a named constant called SIZE, then you can use that in the loop bounds. Changing the size of the array then only requires changing the definition of SIZE.
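Since the question's book compares C with Java, here is the same point as a small Java sketch rather than C: a static final constant plays the role of the #define. ArraySizeDemo and filled are made-up names for illustration.

```java
class ArraySizeDemo {
    // Named constant: changing the array size means changing only this line.
    static final int SIZE = 5;

    // Allocates the array and fills it, using SIZE for both the length and the loop bound.
    static int[] filled() {
        int[] a = new int[SIZE];
        for (int i = 0; i < SIZE; i++) {
            a[i] = i * i;
        }
        return a;
    }
}
```

In Java you could also write a.length in the loop, which is roughly the counterpart of the sizeof approach mentioned in another answer.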
There is also the question of whether #define is really the right solution.
To borrow another comment, from Jonathan Leffler: see static const vs #define vs enum for a discussion of different ways of naming constants. While modern C does allow using a variable as an array size specifier, this technically results in a variable-length array, which may incur a small overhead.
You should use a constant, because embedding magic numbers in code makes it harder to read and maintain. For instance, if you see 52 in some code, you don't know what it is. However, if you write #define DECKSIZE 52, then whenever you see DECKSIZE, you know exactly what it means. In addition, if you want to change the deck size, say 36 for durak, you could simply change one line, instead of changing every instance throughout the code base.
Well, imagine that you create a static array of 5 integers just like you did, int my_arr[5];, and you code a whole program with it, but suddenly you realise that maybe you need more space. If you wrote 600-700 lines of code, you MUST replace every occurrence of the array size with the new number of your choice: every for loop, and everything else that is related to the size of this array. You can avoid all of this by using the preprocessor command #define, which replaces every occurrence of a "keyword" with the content you want; it's like a synonym for something. E.g. #define SIZE 5 will replace every occurrence of the word SIZE in your code with the value 5.
I find the comments here to be superfluous. As long as you use your constant (5 in this case) only once, it doesn't matter where it is. Moreover, having it in place improves readability. And you certainly do not need to use the constant in more than one place; after all, you should infer the size of the array through the sizeof operator anyway. The benefit of the sizeof approach is that it works seamlessly with VLAs.
The drawback of a global #define (or any other global name) is that it pollutes the global namespace. One should understand that global names are a resource to be used conservatively.
#define SIZE 5
This looks like an old outdated way of declaring constants in C code that was popular in dinosaur era. I suppose some lovers of this style are still alive.
The preferred way to declare constants in C languages nowadays is:
const int kSize = 5;
I know that -1, 0, 1, and 2 are exceptions to the magic number rule. However, I was wondering if the same is true when they are floats. Do I have to initialize a final variable for them, or can I just use them directly in my program?
I am using it as a percentage in a class. If the input is less than 0.0 or greater than 1.0, then I want it to set the percentage automatically to zero. So if (0.0 <= input && input <= 1.0).
Thank you
Those numbers aren't really exceptions to the magic number rule. The common sense rule (as far as there is "one" rule), when it isn't simplified to the level of dogma, is basically, "Don't use numbers in a context that doesn't make their meaning obvious." It just so happens that these four numbers are very commonly used in obvious contexts. That doesn't mean they're the only numbers where this applies, e.g. if I have:
long kilometersToMeters(int km) { return km * 1000L; }
there is really no point in naming the number: it's obvious from the tiny context that it's a conversion factor. On the other hand, if I do this in some low-level code:
sendCommandToDevice(1);
it's still wrong, because that should be a constant kResetCommand = 1 or something like it.
So whether 0.0 and 1.0 should be replaced by a constant completely depends on the context.
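As one possible reading of the question's case, here is a minimal sketch where the literals arguably explain themselves, because the class name and the range check supply the context. Percentage, set and get are assumed names, not taken from the question.

```java
class Percentage {
    private double value;

    // Stores the input if it lies in [0.0, 1.0]; anything else (including NaN) becomes zero.
    void set(double input) {
        value = (0.0 <= input && input <= 1.0) ? input : 0.0;
    }

    double get() {
        return value;
    }
}
```

If the same bounds started showing up in several unrelated places, that would be the point to consider naming them.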
It really depends on the context. The whole point of avoiding magic numbers is to maintain the readability of your code. Use your best judgement, or provide us with some context so that we may use ours.
Magic numbers are [u]nique values with unexplained meaning or multiple occurrences which could (preferably) be replaced with named constants.
http://en.wikipedia.org/wiki/Magic_number_(programming)
Edit: When to document code with variables names vs. when to just use a number is a hotly debated topic. My opinion is that of the author of the Wiki article linked above: if the meaning is not immediately obvious and it occurs multiple times in your code, use a named constant. If it only occurs once, just comment the code.
If you are interested in other people's (strongly biased) opinions, read
What is self-documenting code and can it replace well documented code?
Usually, every rule has exceptions (and this one too). It is a matter of style to use some mnemonic names for these constants.
For example:
int Rows = 2;
int Cols = 2;
Is a pretty valid example where usage of raw values will be misleading.
The meaning of the magic number should be obvious from the context. If it is not - give the thing a name.
Attaching a name for something creates an identity. Given the definitions
const double Moe = 2.0;
const double Joe = 2.0;
...
double Larry = Moe;
double Harry = Moe;
double Garry = Joe;
the use of symbols for Moe and Joe suggests that the default value of Larry and Harry are related to each other in a way that the default value of Garry is not. The decision of whether or not to define a name for a particular constant shouldn't depend upon the value of that constant, but rather whether it will non-coincidentally appear multiple places in the code. If one is communicating with a remote device which requires that a particular byte value be sent to it to trigger a reset, I would consider:
void ResetDevice()
{
    // The 0xF9 command is described in the REMOTE_RESET section of the
    // Frobnitz 9000 manual
    transmitByte(0xF9);
}
... elsewhere
myDevice.ResetDevice();
...
otherDevice.ResetDevice();
to be in many cases superior to
// The 0xF9 command is described in the REMOTE_RESET section of the
// Frobnitz 9000 manual
const int FrobnitzResetCode = 0xF9;
... elsewhere
myDevice.transmitByte(FrobnitzResetCode );
...
otherDevice.transmitByte(FrobnitzResetCode );
The value 0xF9 has no real meaning outside the context of resetting the Frobnitz 9000 device. Unless there is some reason why outside code should prefer to send the necessary value itself rather than calling a ResetDevice method, the constant should have no value to any code outside the method. While one could perhaps use
void ResetDevice()
{
    // The 0xF9 command is described in the REMOTE_RESET section of the
    // Frobnitz 9000 manual
    int FrobnitzResetCode = 0xF9;
    transmitByte(FrobnitzResetCode);
}
there's really not much point to defining a name for something which is in such a narrow context.
The only thing "special" about values like 0 and 1 is that they are used significantly more often than other constants, such as 23, in cases where they have no domain-specific identity outside the context where they are used. If one is using a function which requires that the first parameter indicate the number of additional parameters (somewhat common in C), it's better to say:
output_multiple_strings(4, "Bob", Joe, Larry, "Fred"); // There are 4 arguments
...
output_multiple_strings(4, "George", Fred, "James", Lucy); // There are 4 arguments
than
#define NUMBER_OF_STRINGS 4 // There are 4 arguments
output_multiple_strings(NUMBER_OF_STRINGS, "Bob", Joe, Larry, "Fred");
...
output_multiple_strings(NUMBER_OF_STRINGS, "George", Fred, "James", Lucy);
The latter statement implies a stronger connection between the value passed to the first method and the value passed to the second, than exists between the value passed to the first method and anything else in that method call. Among other things, if one of the calls needs to be changed to pass 5 arguments, it would be unclear in the second code sample what should be changed to allow that. By contrast, in the former sample, the constant "4" should be changed to "5".
I have updated this question (I found the last version unclear; if you want to refer to it, check the revision history). The answers so far do not work because I failed to explain my question clearly (sorry, second attempt).
Goal:
Trying to take a set of numbers(pos or neg, thus needs bounds to limit growth of specific variable) and find their linear combinations that can be used to get to a specific sum. For example, to get to a sum of 10 using [2,4,5] we get:
5*2 + 0*4 + 0*5 = 10
3*2 + 1*4 + 0*5 = 10
1*2 + 2*4 + 0*5 = 10
0*2 + 0*4 + 2*5 = 10
How can I create an algo that is scalable for large number of variables and target_sums? I can write the code on my own if an algo is given, but if there's a library avail, I'm fine with any library but prefer to use java.
One idea would be to break out of the loop once you set T[z][i] to true, since you are only basically modifying T[z][i] here, and if it does become true, it won't ever be modified again.
for i = 1 to k
    for z = 0 to sum:
        for j = z-x_i to 0:
            if(T[j][i-1]):
                T[z][i]=true;
                break;
EDIT2: Additionally, if I am getting it right, T[z][i] depends on the array T[z-x_i..0][i-1]. T[z+1][i] depends on T[z+1-x_i..0][i-1]. So once you know if T[z][i] is true, you only need to check one additional element (T[z+1-x_i][i-1]) to know if T[z+1][i] will be true.
Let's say you represent the fact whether T[z][i] was updated by a variable changed. Then, you can simply say that T[z][i] = changed && T[z-1][i]. So you should be done in two loops instead of three. This should make it much faster.
Now, to scale it - Now that T[z,i] depends only on T[z-1,i] and T[z-1-x_i,i-1], so to populate T[z,i], you do not need to wait until the whole (i-1)th column is populated. You can start working on T[z,i] as soon as the required values are populated. I can't implement it without knowing the details, but you can try this approach.
I take it this is something like unbounded knapsack? You can dispense with the loop over c entirely.
for i = 1 to k
    for z = 0 to sum
        T[z][i] = z >= x_i cand (T[z - x_i][i - 1] or T[z - x_i][i])
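For comparison, here is a sketch of the textbook unbounded-knapsack (coin-change style) table in Java. It is not a transcription of the recurrence above: it uses the common one-dimensional form, and it assumes all values are positive integers and all coefficients are non-negative, which is narrower than the question's "pos or neg" inputs. CombinationDp is a made-up name.

```java
class CombinationDp {
    // True if sum can be written as a non-negative integer combination of values.
    static boolean reachable(int[] values, int sum) {
        boolean[] can = new boolean[sum + 1];
        can[0] = true; // the empty combination reaches 0
        for (int x : values) {
            // increasing z lets the same value be reused any number of times
            for (int z = x; z <= sum; z++) {
                if (can[z - x]) {
                    can[z] = true;
                }
            }
        }
        return can[sum];
    }

    // Number of distinct coefficient vectors that reach sum.
    static long countCombinations(int[] values, int sum) {
        long[] ways = new long[sum + 1];
        ways[0] = 1;
        for (int x : values) {
            for (int z = x; z <= sum; z++) {
                ways[z] += ways[z - x];
            }
        }
        return ways[sum];
    }
}
```

For [2,4,5] and a target of 10, countCombinations returns 4, matching the four combinations listed in the question. Both loops run in O(k * sum) time with O(sum) memory.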
Based on the original example data you gave (linear combination of terms) and your answer to my question in the comments section (there are bounds), would a brute force approach not work?
c0x0 + c1x1 + c2x2 +...+ cnxn = SUM
I'm guessing I'm missing something important but here it is anyway:
Brute Force Divide and Conquer:
main controller generates coefficients for say, half of the terms (or however many may make sense)
it then sends each partial set of fixed coefficients to a work queue
a worker picks up a partial set of fixed coefficients and proceeds to brute force its own way through the remaining combinations
it doesn't use much memory at all as it works sequentially on each valid set of coefficients
could be optimized to ignore equivalent combinations and probably many other ways
Pseudocode for Multiprocessing
class Controller
    work_queue = Queue
    solution_queue = Queue
    solution_sets = []
    create x number of workers with access to work_queue and solution_queue
    #say for 2000 terms:
    for partial_set in coefficient_generator(start_term=0, end_term=999):
        if worker_available(): #generate just in time
            push partial set onto work_queue
        while solution_queue:
            add any solutions to solution_sets
            #there is an efficient way to do this type of polling but I forget
class Worker
    while true: #actually stops when a stop work token is received
        get partial_set from the work queue
        for remaining_set in coefficient_generator(start_term=1000, end_term=1999):
            combine the two sets (partial_set.extend(remaining_set))
            if is_solution(full_set):
                push full_set onto the solution queue
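A rough Java sketch of the same split, under simplifying assumptions: the "controller" fixes only the first coefficient, every coefficient is bounded to 0..maxCoeff, and an ExecutorService stands in for the explicit work and solution queues. BruteForceCombos, search and solve are illustrative names, and terms must contain at least one element.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class BruteForceCombos {
    // Worker part: tries every coefficient 0..maxCoeff for terms[index..] and
    // records each full coefficient vector whose weighted sum hits the target.
    static void search(int[] terms, int index, int remaining, int maxCoeff,
                       int[] coeffs, List<int[]> out) {
        if (index == terms.length) {
            if (remaining == 0) {
                out.add(coeffs.clone());
            }
            return;
        }
        for (int c = 0; c <= maxCoeff; c++) {
            coeffs[index] = c;
            search(terms, index + 1, remaining - c * terms[index], maxCoeff, coeffs, out);
        }
    }

    // Controller part: one task per value of the first coefficient.
    static List<int[]> solve(int[] terms, int target, int maxCoeff, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<List<int[]>>> futures = new ArrayList<>();
            for (int c0 = 0; c0 <= maxCoeff; c0++) {
                final int first = c0;
                futures.add(pool.submit(() -> {
                    int[] coeffs = new int[terms.length]; // private to this task
                    coeffs[0] = first;
                    List<int[]> found = new ArrayList<>();
                    search(terms, 1, target - first * terms[0], maxCoeff, coeffs, found);
                    return found;
                }));
            }
            List<int[]> all = new ArrayList<>();
            for (Future<List<int[]>> f : futures) {
                all.addAll(f.get()); // blocks until that task has finished
            }
            return all;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("worker failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

For terms [2,4,5], target 10 and maxCoeff 5 this finds the four combinations from the question. The running time still grows as (maxCoeff + 1)^n, so the thread pool only buys a constant factor.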