dxf, CAM and other comments

on 2003-10-27 21:45:10 UTC

Don't read this if you couldn't care less about .dxf files and how they
get turned into G-codes. A lot of this may be very, very basic stuff to
many of you, I suspect. But there may be some others who, like me, are
curious about the whole process.

If this is too far "OFF TOPIC" please let me know. If anyone else wants
to jump in, you can download the .pdf files describing .dxf format like
I did from the AutoDesk site, create an "empty file" and "test files"
(described here later) and let me know what you find out.

There are a lot of interesting aspects of the whole CAD/CAM arrangement
which one does not have to investigate in order to "make stuff". This is
why we buy software to do certain things for us. As I have been
investigating, the most interesting and most complicated part of the
process seems to be the translation of the CAD (in most cases .dxf
files) to G-codes. This also seems to be the most misunderstood. We all
understand our drawing (CAD) programs to one extent or another and we
all pretty much understand G-codes (at least the basic ones) ... but how
does the "magic" happen which changes a .dxf ASCII text file into a
G-code ASCII text file which can be fed happily to our
"read-my-Gcode-and-make-my-motors-go" program?

There isn't a lot of discussion about how this "magic" happens. Most
discussions are like, "well I use this black-box and the .dxf goes in
here and the G-code comes out here". I also see people continually
confusing CAM programs with the program which takes G-codes and makes
the motors run. Taking G-codes and making the motors run is not really
what I consider CAM. CAM is Computer Aided Manufacturing and the
computer is not aiding you a whole lot in "manufacturing" if it's just
taking a G-code file and making your table move around. The real "aid"
that we're all after is "getting from the drawing to the G-code". In
other words, getting from a virtual imaginary frame on the monitor to
the real world of "making chips" taking into account that in one place I
have pixels on a screen and in the other I have a whirling sharp thing
which can make a mess of things if the drawing to G-code translation
doesn't go well. What does it mean to "go well"? For example, if the
G-code translator doesn't take into account the fact that you are using
a 1/4" milling cutter and not a "virtually dimensionless imaginary line
cutter", you don't get a pocket which is 2" square but one that is
"somewhat bigger than that" (the cutter removes material on both sides
of the drawn line). There are many other decisions like this that have
to be made by the CAM program as it translates your drawing to machine
motion commands. This is the "real CAM" ... all these decisions which
need to be made intelligently in order to "aid" us in manufacturing what
it is we are drawing. The G-code to table motion software is "machine
control". I suppose you could quibble over semantics but I've seen a lot
of situations where people putting together systems think they have CAM
when all they have is "machine control". In fact, I would go as far as
to say it's probably THE most common misconception I've seen in all the
various CNC groups I read and the least understood issue.
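
To put numbers on the cutter example, here is a minimal Python sketch. It assumes the simplest possible case: a square pocket, a round cutter, and a translator that either drives the tool center straight along the drawn outline or shifts the path inward by the cutter radius. The function name and its arguments are made up for illustration; real CAM programs handle far more than this.

```python
def pocket_sizes(drawn_side, cutter_dia):
    """Return (pocket size if the tool center follows the drawn line,
    side length of the tool-center path needed for a correct pocket)."""
    # The cutter removes one radius of material on each side of its path,
    # so following the outline makes the pocket one full diameter too big.
    uncompensated_pocket = drawn_side + cutter_dia
    # To get the drawn size, the tool center must run one radius inside
    # the outline on every side.
    compensated_path_side = drawn_side - cutter_dia
    return uncompensated_pocket, compensated_path_side


# The 2" square pocket and 1/4" cutter from the text.
print(pocket_sizes(2.0, 0.25))
```

That inward shift of the path is the kind of decision the "real CAM" step has to make for every feature in the drawing.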

As I said, you don't have to understand all this to "make stuff". But I
was curious.

The first part of the process has to be understanding how a .dxf file
records information about all the wonderful stuff you can draw. If you
and I were talking about a drawing conversationally I would say
something like, "start at the bottom left corner of the paper and
measure up 4" then over to the right 3" and make a dot then put a
compass on the dot set for a two inch radius and draw a circle". Well
.dxf files are "sort of" like that but the good folks at AutoCAD needed
to come up with a drawing description system that could encompass a huge
number of variables in terms of the drawing environment (color, line
weight, layers, etc, etc.). If you look at a .dxf file (which you can by
opening it with any text editor) there is a huge amount of "stuff"
which is not strictly of interest if all you want to do is parse out
tool paths ex: "this is a line and it starts here and ends there". The
more flexible you make a system, the more complicated it has to be to
incorporate all the variables. That's why, if you've programmed before,
you know you might spend 1 hour on the code that does "X" and 10 hours
more coding making sure that the USER can do whatever he wants with "X"
in terms of inputs and conditions and flavors and intentions for the end
use of "X" etc.

In order to ignore the "let's cover everything in the world of drawing
stuff" and just parse out the "here's how the stuff you draw is
described and recorded" I created an empty file in my CAD program and
exported it as a .dxf file establishing the "base line" and then created
some simple files (a line, a circle, a rectangle, etc) and exported
them. I used a "compare" function in my text editor to find what was
different between the empty and the test file (typical reverse
engineering procedure) and made some comments on the results.
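
If your text editor has no "compare" function, the same empty-versus-test comparison can be sketched with Python's standard `difflib` module. This is only one way to do it, not the procedure I actually used; the function name is invented, and the two files are passed in as strings.

```python
import difflib


def dxf_differences(empty_text, test_text):
    """Return only the lines that differ between two DXF files.

    Lines removed from the empty file are prefixed with '-',
    lines added in the test file are prefixed with '+'.
    """
    diff = list(difflib.unified_diff(
        empty_text.splitlines(), test_text.splitlines(),
        fromfile="empty.dxf", tofile="test.dxf", lineterm="", n=0))
    body = diff[2:]  # drop the two-line '---' / '+++' file header
    return [line for line in body if line.startswith(("+", "-"))]


# Tiny made-up fragments standing in for two exported files.
empty = "9\n$HANDSEED\n5\n2E\n"
test = "9\n$HANDSEED\n5\n2F\n"
print(dxf_differences(empty, test))
```

With real files you would read each one with `open(path).read()` and pass the two strings in.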

Below I list the differences and make some notes and observations. Don't
misconstrue this as some type of tutorial on .dxf files, it's just an
hour or two of poking around and trying to understand what's going on.
You might be interested, you might not. You might know a whole lot more
about this than I've been able to find in my poking.

IF SO I'D LOVE TO HEAR FROM YOU and would appreciate any pointers to
information on understanding .dxf files. The stuff from AutoCAD
"assumes" you know a lot about their world already ... I don't. I use a
program called VECTORWORKS on a Mac ... very "outside" their world.

The second part of the process after understanding how basic graphic
elements are recorded in the .dxf file is to understand what goes into
converting them to machine movement descriptions aka G-code. How well
this is done and how many specific G-code "built in" functions can be
employed in the process is the difference between a CAM program which
costs $0.00 and one that costs $1000.00.

Included in this understanding would be things like tool offsets,
backlash compensation adjustments to G-code values, lead-in/lead-out
etc.

Like I said, I wanted to know how dxf files work. So I went to the
AutoDesk site and downloaded the .pdf file for version 2002.

AUTOCAD 2002 DXF REFERENCE GUIDE

The first thing to know about .dxf files is they come in "many flavors".
The lines in your .dxf file which describe "what flavor" or release
version of AutoCAD (or which version's rules some other drawing program
which uses the .dxf format followed to create its .dxf files) are at the
beginning in the header section.

Also there is a lot of terminology I don't fully understand in terms of
defining the parts of the dxf file. There are blocks, sections,
entities, objects, group codes, classes, layers, etc. ... I haven't
sorted all these out just yet (again jump in if you care to).

The line numbers down the left are NOT part of the .dxf file. I added
them in my text editor for clarity.

These lines describe the version of AutoCAD dxf format

5 9
6 $ACADVER
7 1
8 AC1015 <--- I'm exporting in 2000 format (which is also 2002 and 2000i)

Version identifier table

$ACADVER   AutoCAD release
AC1018     2004 <--- the people at AutoDesk are obviously time travelers
AC1015     2002, 2000i, 2000
AC1014     14
AC1012     13
AC1009     12, 11
AC1006     10
AC1004     9
AC1002     2.6
AC1.50     2.05

Notes:

* AutoCAD 2000, 2000i, and 2002 all have the same identifier because the
drawing format is fully compatible in these releases.

* If you are editing a DXF file in a text editor, you must save the file
in ASCII text format.

* The AutoCAD Release 14 Customization Guide states the $ACADVER for
Release 14 is AC1013. This is a documentation error and is corrected in
the Help update.

* The AutoCAD Release 14 sample drawings contain the version identifier
AC1013.

* AutoCAD Release 11 and 12 have the same identifier because Release 12
was backward compatible with Release 11.
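
As a sketch of how a program might read the "flavor" for itself, the Python below looks for the `$ACADVER` variable name and takes the value two lines later (after the group code line), then looks it up in the table above. The function name is made up, and the lookup table contains only the identifiers listed in this post.

```python
# Identifier table copied from the version table in this post.
ACADVER_RELEASES = {
    "AC1018": "2004",
    "AC1015": "2002, 2000i, 2000",
    "AC1014": "14",
    "AC1012": "13",
    "AC1009": "12, 11",
    "AC1006": "10",
    "AC1004": "9",
    "AC1002": "2.6",
    "AC1.50": "2.05",
}


def find_acadver(dxf_text):
    """Return (identifier, release) from the header, or None if absent."""
    lines = [line.strip() for line in dxf_text.splitlines()]
    for i, line in enumerate(lines):
        # Layout in the file: $ACADVER, then a group code line, then the value.
        if line == "$ACADVER" and i + 2 < len(lines):
            ident = lines[i + 2]
            return ident, ACADVER_RELEASES.get(ident, "unknown")
    return None


print(find_acadver("9\n$ACADVER\n1\nAC1015\n"))
```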

So, as I said, to find out how these suckers are put together I generated
an empty file in VectorWorks (henceforth referred to as VW) and exported it
in AutoCAD dxf 2000 flavor. Then I generated other files to compare it to:
a single line, a single circle, a single rectangle, etc.

--------These lines all differ in the header part -------------------------

and they will always be different because they are the time/date stamp for
when the file was saved.

446 $TDCREATE           <--- $ prefix indicates a "system" variable:
                             TIME DATE CREATE
447 40                  <--- group code
448 2452940.713969907   <--- the .pdf manual describes how this date/time
                             is derived mathematically.
                             I don't really care at this point.
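
For anyone who does care, here is a hedged Python sketch of the conversion. It assumes the whole part of the number is a Julian day number naming the calendar date and the fractional part is the fraction of that day elapsed since midnight, local time. That reading gives a date matching the day this was posted, but treat it as an assumption and check the reference guide. The function and constant names are invented.

```python
from datetime import date, datetime, timedelta

# Offset between a Julian day number and Python's date ordinal:
# 2000-01-01 is Julian day number 2451545 and date ordinal 730120.
JDN_TO_ORDINAL = 1721425


def tdcreate_to_datetime(value):
    """Convert a $TDCREATE-style number to a (local) datetime.

    Assumption: whole part = Julian day number of the calendar date,
    fraction = fraction of that day elapsed since midnight.
    """
    day_number = int(value)
    fraction = value - day_number
    day = date.fromordinal(day_number - JDN_TO_ORDINAL)
    return datetime(day.year, day.month, day.day) + timedelta(days=fraction)


print(tdcreate_to_datetime(2452940.713969907))
```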

These next lines that differ are still a bit of a mystery (jump in if you
know!)

Described in the documentation as: "Next available handle".
A DXF system variable but otherwise not sure what it is or does.
The two files I created after "empty.dxf" have 2F here. My guess is
that as you create objects/entities "handles" are assigned in
sequence. How these are used I don't know exactly. If it's a single
byte HEX number that would seem to limit the number of "handles" to 256
which certainly doesn't seem right for complicated drawings unless they're
reused or something.

506 $HANDSEED <----we need to "seed" the handle id with a start number
507 5 <----group code saying handle is coming up
508 2E <----why 2E to start ... no idea
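
The handles that show up in the listings further down (2E, 2F, 30, 31, 32, 33) are consistent with a plain hexadecimal counter. As far as I can tell from the DXF reference, a handle is a text string of hex digits rather than a single byte, so it would not be capped at 256. Here is a small Python sketch of "next handle" under that assumption; the function name is made up.

```python
def next_handle(handle):
    """Return the handle that follows `handle`, counting in hexadecimal."""
    return format(int(handle, 16) + 1, "X")


handle = "2E"  # the $HANDSEED value from my empty file
for _ in range(5):
    handle = next_handle(handle)
    print(handle)
```

Note that the counter simply grows an extra digit after FF, which is how a text-string handle would avoid the one-byte limit.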

Now until you get down about another 1500 lines there is no difference
between an empty file and one with a single line drawn in it. So what is
all this code in between? Well, things like what the world coordinate
system looks like, what units you are using, and a LOT of other stuff
which I'm not interested in at the moment.

What I want to know is "where are my line, circle, rectangle and their
dimensions stored?"

Again, I added the line numbers for ease of reference; they do not appear
in the dxf files.

This describes the single line test (i.e. these are the lines which are
different than in the "empty" file). One would assume that a file which
contained 500 lines would look the same as this ... just more-ish.

2123 0          <--- this is a group code; specifically, it indicates that
                     the next line contains a string describing the entity
                     type
2124 SECTION    <--- in this case a "section"
2125 2          <--- this is the group code for a name ("attribute tag,
                     block name and so on")
2126 ENTITIES   <--- ENTITIES are graphic elements (OBJECTS are not)
2127 0          <--- group code again: the next line contains a string
                     describing the entity type
2128 LINE       <--- hey, a "line" ... my line ... now we're talking. We're
                     2000 lines of code into the .dxf file and now my line
                     appears, hurrah!
2129 5          <--- a "common group code for entities"; this means a
                     "handle" assignment is coming up
2130 2E         <--- since this is the first "handle" created, it is the
                     value set in $HANDSEED (see above). One would guess
                     the next entity would get a handle 2F, etc. One would
                     also guess handles are used to move things around,
                     align things, etc. ... I don't know.
2131 100        <--- subclass marker (a "block group code"); the class
                     name is coming next
2132 AcDbEntity <--- AcDbEntity is a "class" ... this describes things
                     about the line which is to follow: what layer, what
                     weight, what color (and from the sign of the color,
                     neg or pos, you can tell if the layer is on)
2133 8          <--- layer name coming up
2134 Layer-1    <--- this entity belongs to "Layer-1"
2135 370        <--- lineweight enum value, stored and moved around as a
                     16 bit integer
2136 5          <--- default in VW. If I change to 40pt in VW this = 100.
                     I'm not sure of the exact relationship between line
                     thickness in VW and lineweight in dxf yet. Thicker in
                     VW = bigger number here
2137 6          <--- this is a layer group code and this one says linetype
                     coming up
2138 Continuous <--- this is a linetype; I think it just means a
                     start-to-finish line which is uninterrupted, i.e. not
                     dashed etc., just goes from here to there
2139 62         <--- color of the line follows (if negative then this
                     layer is turned off)
2140 7          <--- black line? default in VW (reminder: VW=VectorWorks,
                     my CAD program)
2141 100        <--- subclass marker (a "block group code")
2142 AcDbLine   <--- AcDbLine is a "class" and what follows tells us about
                     the line

The numeric values given below are in WCS (World Coordinate System), which
is based on how the software which made the file views the "world" in terms
of a Cartesian coordinate system, i.e. where the 0,0,0 origin is defined as
you start adding stuff to it or around it.

2143 10 <---first corner in WCS x value coming up
I never thought of a "line" having a corner
but ... hey
2144 -2.75 <---this is the x origin of the line
2145 20 <---first corner in WCS y value coming up
2146 0.0 <---this is the y origin of the line
2147 30 <---first corner in WCS z value coming up
2148 0.0 <---this is the z origin of the line
2149 11 <---second corner in WCS x value coming up
2150 -0.3749999999999999 <---this is the x value of the end of the line
2151 21 <---second corner in WCS y value coming up
2152 0.0 <---this is the y value of the end of the line
2153 31 <---second corner in WCS z value coming up
2154 0.0 <---this is the z value of the end of the line

It's not important what the values are here since you can manipulate them
mathematically relative to the WCS to find out how long the line is and
where it is, etc.

Because I just plopped it on the page the values above are what they are.
If I use the SET ORIGIN function in my CAD program to make the left end or
start point of my line equal to the "ORIGIN" then the start values above
would be 0,0,0 (x,y,z) and the end values would be:

2150 2.375
2152 0.0 <--- the "second corner" or end of the line
2154 0.0

2155 0 <---this is a group code, specifically it indicates that
the next line contains a string describing the entity
type
2156 ENDSEC <---in this case a "end of section"

So if I were looking to find all the simple lines in my .dxf file I could
search for the AcDbLine "class" and find the start and end points of the
line (and of course its length) by looking at the 10,20,30 and 11,21,31
codes. That was easy.
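
Here is a minimal Python sketch of that search. It assumes the file is strictly alternating "group code line, value line" pairs, as in the listing above, and it keys on the `LINE` entity name after a group code 0 rather than on the AcDbLine marker. The function names are mine and this is nowhere near a complete DXF reader (it would also pick up any LINE records outside the ENTITIES section).

```python
import math


def read_pairs(dxf_text):
    """Yield (group_code, value) tuples: a code line, then a value line."""
    lines = [line.strip() for line in dxf_text.splitlines()]
    for i in range(0, len(lines) - 1, 2):
        yield int(lines[i]), lines[i + 1]


def find_lines(dxf_text):
    """Return [(start_xyz, end_xyz), ...] for every LINE entity found."""
    records, current = [], None
    for code, value in read_pairs(dxf_text):
        if code == 0:
            # A group code 0 starts a new entity; only track it if it's a LINE.
            current = {} if value == "LINE" else None
            if current is not None:
                records.append(current)
        elif current is not None and code in (10, 20, 30, 11, 21, 31):
            current[code] = float(value)
    return [((r.get(10, 0.0), r.get(20, 0.0), r.get(30, 0.0)),
             (r.get(11, 0.0), r.get(21, 0.0), r.get(31, 0.0)))
            for r in records]


def line_length(start, end):
    """Straight-line distance between two (x, y, z) points."""
    return math.dist(start, end)


# The coordinate values from my single line test (end x rounded).
sample = "\n".join([
    "0", "SECTION", "2", "ENTITIES", "0", "LINE", "5", "2E",
    "10", "-2.75", "20", "0.0", "30", "0.0",
    "11", "-0.375", "21", "0.0", "31", "0.0",
    "0", "ENDSEC"])
for start, end in find_lines(sample):
    print(start, end, line_length(start, end))
```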

This next bit is from comparing the "empty" file with a file containing a
single circle (again just plopped down on the page).

From 2123 to 2141 see the comments above about handle number, lineweight,
color, layer, etc.

It is perhaps worth mentioning (and perhaps painfully obvious) that
things like lineweight, color and layer don't really have those
meanings when a G-code file is created. There is no G-code for
"now mill this in blue" ... but these values can be used (if we assume an
understanding on the part of the draftsman and the G-code conversion
program as to what they will stand for) as "keys" to doing "real world"
stuff like separating out machining functions.

If the line is "blue" use this diameter cutter, if the layer is #2 then
that's where my drilling functions reside, etc.
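
A sketch of that "keys" idea in Python follows. Every entry in the two tables is a made-up convention of the kind a draftsman and a conversion program would have to agree on; nothing here comes from the DXF format itself except that color 5 is blue in AutoCAD's color index and that layer and color arrive via group codes 8 and 62.

```python
# Hypothetical shop conventions, purely for illustration.
OPERATION_BY_LAYER = {"Layer-1": "profile", "Layer-2": "drill"}
CUTTER_DIA_BY_COLOR = {5: 0.125, 7: 0.25}  # color number -> cutter diameter


def plan_operation(layer, color):
    """Turn an entity's layer name and color number into machining choices."""
    return {
        "operation": OPERATION_BY_LAYER.get(layer, "profile"),
        # abs() because a negative color only signals that the layer is off.
        "cutter_dia": CUTTER_DIA_BY_COLOR.get(abs(color), 0.25),
    }


print(plan_operation("Layer-2", 5))
```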

Again for what this next section of uncommented code does, refer to the
comments above.

2123 0
2124 SECTION
2125 2
2126 ENTITIES
2127 0
2128 CIRCLE
2129 5
2130 2E
2131 100
2132 AcDbEntity
2133 8
2134 Layer-1
2135 370
2136 5
2137 6
2138 Continuous
2139 62
2140 7
2141 100 <---ok here's our block group code
2142 AcDbCircle <---and this tells us a circle entity is coming up
2143 10 <---here's the old 10,20,30 x,y,z but since this is a
circle, this describes the center point
2144 -2.0
2145 20
2146 2.125
2147 30
2148 0.0
2149 40 <---ah ... and here's something new, code 40 which
says "radius to follow"
2150 1.0702722283113835 <---the radius of the circle
2151 0 <---and the usual "end of section"
2152 ENDSEC

So if I were looking for all the circles in my .dxf file I would search for
AcDbCircle and then parse the center point and radius from the above.
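
Once the center and radius are parsed, turning the circle into motion is mostly formatting. The Python sketch below emits two clockwise half-circle arcs with G2, assuming a controller that takes I and J as offsets from the arc's start point to its center, which is common but not universal. It leaves out everything a real program needs: Z moves, cutter radius offset, lead-in and so on. The function name and the feed value are made up.

```python
def circle_to_gcode(cx, cy, r, feed):
    """Return G-code lines tracing a full circle as two clockwise arcs."""
    right = (cx + r, cy)  # start at the 3 o'clock position
    left = (cx - r, cy)   # halfway round, at 9 o'clock
    return [
        f"G0 X{right[0]:.4f} Y{right[1]:.4f}",
        # I/J assumed to be the offset from the arc start to the center.
        f"G2 X{left[0]:.4f} Y{left[1]:.4f} I{-r:.4f} J{0.0:.4f} F{feed:.1f}",
        f"G2 X{right[0]:.4f} Y{right[1]:.4f} I{r:.4f} J{0.0:.4f} F{feed:.1f}",
    ]


# Center from my circle test, radius rounded to 1.0 for readability.
for line in circle_to_gcode(-2.0, 2.125, 1.0, 10.0):
    print(line)
```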

The next bit describes the comparison between "empty file" and the test file
with the simple rectangle.

From 2123 to 2141 see the comments above about handle number, lineweight,
color, layer

2123 0
2124 SECTION
2125 2
2126 ENTITIES
2127 0
2128 POLYLINE
2129 5
2130 2E
2131 100
2132 AcDbEntity
2133 8
2134 Layer-1
2135 370
2136 5
2137 6
2138 Continuous
2139 62
2140 7

Now things get a bit more interesting but not really difficult

2141 100
2142 AcDb2dPolyline <---here's our new "thing", the "polyline", which makes
sense: why would you bother to come up with a way to
describe a closed path 4 sided object when you
can describe more generally any group of connected
lines no matter how many are involved and it doesn't
matter if the last one connects to the first one
or not. You could use this approach to describe a
lightning bolt design or an octagon. So unlike LINE
and CIRCLE there is no "RECTANGLE" or "SQUARE".
Also if you "explode" the rectangle (or whatever
your CAD program uses to describe this function,
SMASH, BASH, DEMOLISH, etc. ... mine calls it
"CONVERT TO LINES") before exporting the file you
won't get a polyline sequence, you'll get 4 LINE
entity descriptions instead.

2143 66 <---apparently as of AutoCAD 2002 this is obsolete
and we are to ignore it and the operand which
follows it and get right on with the first line
2144 1 <---ok ... I'm ignoring you
2145 10 <---here starts the first 10,20,30 xyz set describing
the first line
2146 0.0 <---the docs say this is always 0 ... what!!!??? so how
do I know where it is in the WCS? probably has to
do with VERTEXs which are coming up
2147 20
2148 0.0 <---y always 0
2149 30
2150 0.0 <---z always 0
well that wasn't much help with my rectangle yet

2151 70 <--group code for polyline flag which is "bit coded"
meaning the numbers here will be 1, 2, 4, 8, 16,
32, 64 or 128 (bit position in binary), or a sum
of several of them
2152 1 <--a "1" means it's a closed polyline (or "polygon
mesh closed in the M direction" which means
nothing to me).
The fact that it is closed means we don't get
an "ending vertex" which describes the same
point as the beginning ... this is assumed
because it is "closed" and you have to connect
the dots (pun intended)
2153 75 <--curves and smooth surface type
2154 5 <--this code says this is a Quadratic B-spline
surface I think this is because my CAD
program considers everything to be made of
B-Splines of one sort or another. For those
of you familiar with ADOBE Illustrator the
pen creates B-Splines which can be straight
lines also as they are here.
2155 0 <---here's the "get ready for an entity type string"
2156 VERTEX <---and here it is the first VERTEX i.e. the start point
of my rectangle
2157 5 <---of course it has to be assigned a handle
2158 2F
2159 100
2160 AcDbEntity <---and we need to know the usual about layer, color,
and linetype ... I guess ... didn't we do this
above right before "AcDb2dPolyline"
anyway ... 2161-2168 are the typical descriptions
2161 8
2162 Layer-1
2163 370
2164 5
2165 6
2166 Continuous
2167 62
2168 7
2169 100 <---subclass marker AcDb2dPolyline or AcDb3dPolyline
2170 AcDbVertex <---hey it's a vertex
2171 100 <---subclass marker AcDb2dPolyline or AcDb3dPolyline
2172 AcDb2dVertex <---hey it's a 2d vertex in this case
2173 10 <---ok here's the 10,20,30 codes x,y,z of the vertex
2174 0.0
2175 20
2176 2.0
2177 30
2178 0.0

So after all that I know where I go to start to draw the first line of
my rectangle (or rather "closed polyline"). So now I need to know where
to draw it "to" ... I need another vertex.

2179 0 <---so here it is along with the lineweight, linetype,
color info, etc.
2180 VERTEX
2181 5
2182 30
2183 100
2184 AcDbEntity
2185 8
2186 Layer-1
2187 370
2188 5
2189 6
2190 Continuous
2191 62
2192 7
2193 100
2194 AcDbVertex
2195 100
2196 AcDb2dVertex
2197 10 <-----here's the 10,20,30 codes with the x,y,z
location
2198 2.5
2199 20
2200 2.0
2201 30
2202 0.0

2203 0 <---and the next one
2204 VERTEX
2205 5
2206 31
2207 100
2208 AcDbEntity
2209 8
2210 Layer-1
2211 370
2212 5
2213 6
2214 Continuous
2215 62
2216 7
2217 100
2218 AcDbVertex
2219 100
2220 AcDb2dVertex
2221 10
2222 2.5
2223 20
2224 0.0
2225 30
2226 0.0

2227 0 <---and the last one (remember, because this
polyline has been identified as "closed" I know to
draw a line (toolpath) from here back to the first vertex)
2228 VERTEX
2229 5
2230 32
2231 100
2232 AcDbEntity
2233 8
2234 Layer-1
2235 370
2236 5
2237 6
2238 Continuous
2239 62
2240 7
2241 100
2242 AcDbVertex
2243 100
2244 AcDb2dVertex
2245 10
2246 0.0
2247 20
2248 0.0
2249 30
2250 0.0

2251 0 <---entity coming
2252 SEQEND <---sequence end ... this sequence of vertices is over with

2253 5 <---assign a handle ... but to what?
<---I suppose this next bit just kinda tidies up a bit after
the polyline, maybe it sets things back to the way they
were before the polyline ... don't know
2254 33
2255 100
2256 AcDbEntity
2257 8
2258 Layer-1 <--layer
2259 370
2260 5 <--line weight
2261 6
2262 Continuous <--line type
2263 62
2264 7 <--color
2265 0
2266 ENDSEC
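
Putting the polyline pieces together, here is a minimal Python sketch that collects the VERTEX points of each POLYLINE, tests bit 1 of the code 70 flag for "closed", and adds the closing segment back to the first vertex when it is set. It assumes the same strict "code line, value line" layout as the listing above and only handles the 2D case; the function names are mine.

```python
def read_pairs(dxf_text):
    """Yield (group_code, value) tuples: a code line, then a value line."""
    lines = [line.strip() for line in dxf_text.splitlines()]
    for i in range(0, len(lines) - 1, 2):
        yield int(lines[i]), lines[i + 1]


def find_polylines(dxf_text):
    """Return [{'closed': bool, 'points': [[x, y], ...]}, ...]."""
    polylines, current, in_vertex = [], None, False
    for code, value in read_pairs(dxf_text):
        if code == 0:
            in_vertex = False
            if value == "POLYLINE":
                current = {"closed": False, "points": []}
                polylines.append(current)
            elif value == "VERTEX" and current is not None:
                current["points"].append([0.0, 0.0])
                in_vertex = True
            else:
                current = None  # SEQEND or any other entity ends the polyline
        elif current is not None:
            if in_vertex and code in (10, 20):
                current["points"][-1][0 if code == 10 else 1] = float(value)
            elif not in_vertex and code == 70:
                # The flag is bit coded; bit value 1 means "closed".
                current["closed"] = bool(int(value) & 1)
    return polylines


def segments(polyline):
    """Return the straight segments, including the closing one if closed."""
    pts = [tuple(p) for p in polyline["points"]]
    segs = list(zip(pts, pts[1:]))
    if polyline["closed"] and len(pts) > 1:
        segs.append((pts[-1], pts[0]))
    return segs


# My rectangle test, stripped down to the codes this sketch reads.
sample = "\n".join([
    "0", "POLYLINE", "66", "1", "10", "0.0", "20", "0.0", "30", "0.0",
    "70", "1",
    "0", "VERTEX", "10", "0.0", "20", "2.0", "30", "0.0",
    "0", "VERTEX", "10", "2.5", "20", "2.0", "30", "0.0",
    "0", "VERTEX", "10", "2.5", "20", "0.0", "30", "0.0",
    "0", "VERTEX", "10", "0.0", "20", "0.0", "30", "0.0",
    "0", "SEQEND", "0", "ENDSEC"])
for seg in segments(find_polylines(sample)[0]):
    print(seg)
```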

So that's what I know so far. Of course if you want to stay very basic,
that's all you need to know. Everything can be drawn with polylines. A
curve can be turned into "polylines" by a function in most CAD programs.
As I mention above it typically has some violent descriptor (SMASH, BASH,
EXPLODE, etc.). Therefore a curve can be converted (reduced, if you will)
into a lot of little short straight lines which approximate the path of
the curve. A circle, being a curve, can be treated the same way, as can
an arc, etc. All you really need are polylines.

Of course the "smoothness" of your curve will depend on how many little
straight lines your drawing program converts it to. A lot makes for a
smoother approximation, and a few makes the curve more like a jagged
line. You can generally tell your CAD program how short to make these
little straight lines when it converts curves to polylines. You can
imagine certain drawings that when SMASHED produce literally thousands of
polylines, which then convert to thousands of G-codes to make those
lines, and sometimes this can be too much for the particular software you
are using to control your machine (buffer overflow ---- or "this is just
too much crap for me to deal with syndrome") and the program will crash
or hang.
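
To see how the number of little straight lines trades off against smoothness, here is a Python sketch that chops a circle into equal chords so that no chord strays from the true circle by more than a chosen tolerance. This is my own illustration of the idea, not how any particular CAD program does its SMASHING; the function name and the tolerance values are assumptions.

```python
import math


def circle_to_polyline(cx, cy, r, tol):
    """Return points on a circle such that each chord deviates from the
    true arc by at most `tol` (same units as the radius)."""
    if r <= 0 or tol <= 0:
        raise ValueError("radius and tolerance must be positive")
    ratio = min(tol / r, 1.0)
    # A chord spanning angle 2a bulges r*(1 - cos(a)) away from the arc,
    # so the largest allowed half-angle is a = acos(1 - tol/r).
    n = max(3, math.ceil(math.pi / math.acos(1.0 - ratio)))
    return [(cx + r * math.cos(2 * math.pi * k / n),
             cy + r * math.sin(2 * math.pi * k / n)) for k in range(n)]


# The circle from my test file at two tolerances.
for tol in (0.01, 0.001):
    pts = circle_to_polyline(-2.0, 2.125, 1.0702722283113835, tol)
    print(tol, len(pts), "segments")
```

Tightening the tolerance by a factor of ten roughly triples the number of segments, which is where those thousands of G-code lines come from.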

Well that's what I know so far. Obviously this is all 2D, things get a lot
more complicated when you add the 3rd dimension I suppose.

Now the next step is how do we parse out the lines, polylines and circles
and make decisions about "tool paths". Since toolpaths involve moving
stuff in space and time we'd like to be efficient. If I draw a star, for
example, I don't want my cutter to always return to the center of the star
before it cuts one of the points. I want the toolpath to cause the cutter
to cut the items closest to where it currently is, to eliminate having to
move all over the place trying to emulate the way the drawing was created.
So there have to be some tables made of dimensions and locations and some
sorting and ordering of these tables before I can create efficient G-code.
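
One simple way to do that sorting is a greedy "nearest neighbor" pass: from wherever the cutter is, always pick the uncut path whose nearer end is closest, and cut it from that end. The Python sketch below is only a first stab at the idea with made-up names. It is not guaranteed to give the shortest total travel, and I don't know how any real CAM program orders its paths.

```python
import math


def order_paths(paths, start=(0.0, 0.0)):
    """Greedily order (start_point, end_point) paths to reduce rapid moves.

    A path may be reversed if its end point is nearer than its start.
    """
    remaining = list(paths)
    ordered = []
    pos = start
    while remaining:
        # Pick the path with an endpoint closest to the current position.
        best = min(remaining, key=lambda p: min(math.dist(pos, p[0]),
                                                math.dist(pos, p[1])))
        remaining.remove(best)
        if math.dist(pos, best[1]) < math.dist(pos, best[0]):
            best = (best[1], best[0])  # cut this one in the other direction
        ordered.append(best)
        pos = best[1]  # the cutter ends up at the end of the path just cut
    return ordered


# Three made-up segments listed in "drawing order".
drawn = [((5, 0), (6, 0)), ((0, 0), (1, 0)), ((3, 0), (1, 0))]
for path in order_paths(drawn):
    print(path)
```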

Cheers,

Ken Jenkins
kjenkins@...
