Finite Global Scheduling Will Not Complete


I have Epicor Support working on this now, but they are taking a little too long to answer, so I figured I could fight this on two fronts.

We are working in Pilot so as not to screw anything up. Some things to note:

  1. All Resource Groups have a Finite Horizon set (at the resource group level)
  2. ALMOST all Resources have Finite Capacity Checked
  3. Overload Horizon has been set to 180 within the Site Maintenance Menu
  4. I have completed most of our jobs after copying Live to Pilot, leaving us with just over 1100 open jobs (some running, some not started)
  5. We run 4 Schedulers and log Process and Scheduling
  6. We have recently run the Global Scheduling Order as well as some of the User Run Corrections per Epicor's request.

Right, so we were running into a roadblock just like this a few months ago when trying to run Infinite Scheduling. The Epicor support guy finally helped us out by telling us to set the Overload Horizon out to 180. This let the scheduler run all the way through to completion. Well, now my boss wants to see what finite would look like. We really want to start using the system for what it's meant for. So I went into the scheduler, turned on Finite Scheduling and ran the process. It eventually hit a roadblock and wouldn't pass a certain job. I tried this for a few more runs, cancelling the process and completing the job that held up the scheduler before continuing. Eventually I got in touch with Epicor again and they informed me of the Finite Horizon box and the Finite Capacity box. I DMT'd "2" into all Finite Horizon boxes and checked off almost all Finite Capacity boxes. Still, the scheduler gets held up at random jobs.

I have not been able to find a correlation between the jobs that halt the system. Some are make to order and some are make to stock. Revs match, part numbers match, etc. I can't keep wasting time completing jobs and rerunning the scheduler, as that's not going to help us once we start to use it in the Live system.

Any ideas as to what options I might be missing?

Joe,
Can you confirm what version of Epicor you are working on this issue with?
Usually one of the things I try when I run into scheduling issues that don't seem to be related to a specific job or resource is to run two User Conversion Programs from the Conversion Workbench:

  • 1070 - Refresh the ShopLoad table.
  • 2180 - Delete ResourceTimeUsed table records that do not exist on JobHead

Then I run the 'Generate Shop Capacity Process' (System Management > Schedule Processes).
If I remember right, when you change certain values on a Resource you will get a warning to run this process.
From time to time I have to run this process about 180 days out and I enable the Delete Shop Caps.

There are also a few other User conversions that might apply to your problem to check with Epicor Support on:

  • 1070 - Refresh ShopLoad Table
  • 1140 - To create/fix the ShopCap.Capacity totals
  • 1230 - Create/Rebuild ShopLoad information
  • 2180 - Delete ResourceTimeUsed table records that do not exist on JobHead

We are currently running 10.0.700.2.

1040 Recalculate Low Level Codes.

1070 Refresh Shopload Table

1140 To create/fix ShopCap capacity totals

1150 This Program rebuilds Shopload Records

1230 Create/Rebuild ShopLoad information

8900 Recalculate the number of machines on a Resource Group

1260 This program will find Jobs where the final assembly child pointer field was corrupt and assign them a correct value. This same logic will be done to the QuoteAsm file.

I also ran the 2180 as you mentioned.

I will rerun the Generate Shop Capacity process because I haven't run that since we last ran the corrections. I will also run the Global Scheduling Order before trying another scheduling run.

Joe,
Any update on this? I was at an EUG meeting and this came up.
Thanks!

Rick, unfortunately no, we are still unable to run finite scheduling. Epicor Support just got back to me this morning with possible fixes, but I have not been able to read over the whole email. One thing I did notice right off the bat was that under each operation (job or Engineering Workbench) you should only have ONE of the following filled in: Capability, Resource Group, or Resource. I know my company has filled in most of the Resource Groups AND Resources. So I will need to take a look at this along with the long list of other possible fixes. I will update tomorrow with my findings.

Joe,
That is interesting because I think the other company at the EUG that was having problems with Finite Scheduling had mentioned that they had selected both Resource Group and Resource on a portion of their Methods...

I did think, though, that selecting all three, or at least two (a Capability and either a Resource Group or a Resource), was allowed... but I wonder why Epicor does not prevent you from selecting more than one...

Thanks for getting back and the update. I will try to get this to the other company for them to test as well.

-Rick
www.getaligned.solutions

Rick,

Glad I could help out a little. I am going to copy and paste the long-winded solutions sheet Epicor Support sent me in hopes that it can help you and your client. I am currently trying to update all of our resources, but one other little fix that made a difference was that we did not have a calendar set up in the Site Maintenance menu (this is also on the list). It's almost like a backup calendar, so if the scheduler comes across a resource with no calendar, it will default to the Site Maintenance calendar instead of having nothing at all. I changed that and removed double resources on a job and I was able to schedule without a problem. Anyway, here is the VERY long list from Epicor. Hope it helps...

MRP/Scheduling process is hanging or looping. Why? It may also be running at a very slow speed and take a long time to complete. The MRP log files may appear to have a lot of idle time in them, where it appears no data is being processed; entries may say Waiting For Next Part or Waiting for Next Job. Manually scheduling a job in Job Entry may also hang or loop.

RESOLUTION
There are several reasons why the system hangs or loops when running Global Scheduling/MRP. In most cases it is due to a file which is growing out of control, and the only way to stop it is to stop and restart the app servers. Below is a list of things to look for when trying to determine what is causing this. In all of the examples below, this will also happen when users are finitely scheduling.

  1. Operations have both a Resource Group and a Resource from that same group listed as scheduling resources.
    In Job Entry, for a selected operation, look at the Job Details tab > Operations tab > Scheduling Resources tab > Detail tab.
    If the Resource here is from within the same Resource Group, this can definitely cause MRP/Global Scheduling to hang. When reviewing the scheduling logs you need to determine where it starts looping, or you might see that it is finding a Conflict with another job and the logs show the same job number repeated over and over.
    If the Conflict is another job, bring that job up in Job Entry; the logs will tell you which assembly sequence/operation to look at. If you determine that it is because of the Resource Group/Resource, then the user needs to determine which one to schedule to, remove the other one from this job operation, and reschedule the job. If their Methods are set up in the same fashion, as viewed in Method Tracker / Engineering Workbench, they will need to fix those as well, in the Engineering Workbench and/or Operation maintenance under Job Mgt > Setup > Operation. A Resource Group and one of its own Resources cannot both be here as Scheduling Requirements, although the software allows you to add them as such and save.

  2. Calendar ID has a space

    When reviewing the scheduling logs, once you determine where the scheduler starts looping, go to Job Entry for that job, look at that assembly/operation sequence, then go to the Resource Group and view the calendar. If the calendar has a space in front of the ID, you will need to request a fix program to eliminate the space.

  3. Operation has long estimated production hours (for example, 100+)

    When reviewing the scheduling logs, once you determine where the scheduler starts looping, go to Job Entry for that job and look at that assembly/operation to check the total Production Hours. If these hours are over 100, the user needs to determine whether this is a legitimate standard. If it is not, change it so the hours are less than 100 and reschedule the job.

  4. Establish a finite horizon (when using finite scheduling)

    If the customer has demand which goes out further than 1 year and they have no Finite Horizon set up on their Resource Groups/Resources, they might want to consider establishing one. They also might want to refer to the scheduling technical reference guide if they have questions as to how this will affect scheduling.

  5. Check Resource Groups/Resources and ensure the number of Resources in the tree view on the left matches the number showing on the Resource Group form, Detail tab, under the Scheduling section on the right, in the field titled Number of Resources.

    If this is the issue, all that needs to be done to fix it is to go to the Resources tab, move one resource out and move it back in, and then check that the view and form are in synch. This can be verified by reviewing the tree view again and comparing it to the Number of Resources on the Resource Group form.
    The user will still need to run Calculate Scheduling Order and Global Scheduling to fix all current jobs in the system so they are scheduled appropriately.
    (This may be related to Change Request Page 38416mps-8772esc and SCR 81671.)

  6. Production Calendars which have a gap in them (for example, they run 8 hour shifts but have 4 consecutive hours checked, followed by a gap of one or more unchecked hours, then the remaining 4 consecutive hours checked). Most users might set up calendars with a gap for lunch, but this is not the place to identify it; that belongs in employee setup or Shift maintenance itself.

    This causes looping because the scheduler is looking for a consistent run of time and is not able to make the jump over the gap. The unchecked hours in the gap must be checked. Also, do not forget to go into Resource Group maintenance and look under the Calendar Exceptions tab for the group itself to see if this gap problem exists on any certain days. Also look in the Resources tab > Calendar for similar exceptions that may have been established for a specific resource.
    See the Word doc attachment to this page = 13766mps-6-screenpics.docx for more information and details on this topic.

  7. The Company Maintenance / Company Configuration and Plant Maintenance / Site Maintenance screens all have Production Calendar fields that must be populated. Make sure these fields are populated with valid working calendars from Production Calendar Maintenance.

  8. On the same screen referred to in step 1 (Job Entry, Job Details tab > Operations tab > Scheduling Resources tab > Detail tab), the middle section has the fields for Capability / Resource Group / Resource. Only one of these fields should be populated. If more than one of these is populated for a Scheduling Resource below an op, this can cause any scheduling function to hang: Job Entry, MRP, or Global functions. Make sure only one of these fields is populated, here and in the Method Tracker / Engineering Workbench where Get Details is done from. Do Actions > Job Details > Operations > Add Scheduling Resource to add another Resource Scheduling item here with only one of these fields populated. See the Word doc attachment to this page = 13766mps-7-screenpics.docx for more information and details on this topic.

  9. In the Resource Group Detail screen, do not check the Use Calendar for Move Time and Use Calendar for Queue Time fields unless the related fields for Queue Hours and Move Hours are set to greater than 0.00. If the hours fields are left at zero and one or both of these boxes is checked, MRP / Scheduling may hang, lock, not complete, or not run correctly. See Word doc attachment to this Page = 13766mps-8-screenpics.docx for this screen.

  10. In Resource Group entry, on the main Detail tab in the lower right, uncheck the Use Calendar for Move Time and Use Calendar for Queue Time boxes if the Move and Queue Hours fields above are left at 0.00. Do the same in the Resources tab. When these fields are left at zero, these boxes should not be checked.

  11. On the Resource Group Detail screens and Resource Detail screens, make sure the calendar fields are populated with valid calendars. If these fields are blank here, the software looks for default calendars in the Calendar fields of the Company maintenance and Plant maintenance screens, so make sure those fields are also populated with valid calendars.

  12. If the AMM module, Advanced Material Management, is installed, then go to Resource Group entry, for each group in the Detail screen the Location box must be checked and the Input and Output Warehouse and Bin fields must be populated. Same for each resource in the Resources tab > Detail tab. See the Word doc attachment to this Page = 13766mps-10-screenpics.docx.

  13. Database Appserver log (see Page 7469mps for more details): in the
    \epicor\epicor905\server\logs directory, look for the databaseID.server.log file. Look for entries recorded while MRP is running, especially around the time that it appears to stop and hang or lock, that indicate it is conflicting with another function or process on the server. Recorded lines here may directly say Conflict, Conflicts, or Conflicting.
    Or these lines may say something about a BPM created under System Management > Business Process Management. Turn off these BPMs, or possibly all BPMs, prior to running MRP to avoid lockups.

  14. Database Replication running at the same time as MRP can also cause it to hang or lock; again, turn this off before MRP is run. Replication is a process where data in the current database is copied to a different database. A recorded line in the appserver log may look like this while it is running:
    (Procedure: 'replicate Bpm/DataTrigger.p' Line:-1)

  15. Run the 5100 database conversion to help with possible low level code problems for parts that may be out of synch or corrupted. (In version 10.0+ it is now the 1040; see Page 164542esc below for instructions.) It is located on the server in the Database Admin Tools > click open the Conversion Programs button. This will resynch these codes for parts that may be corrupt and out of synch. See general info Page 3588mps for more information about these codes. They are stored in the Part table, LowLevelCode field; the initial value is 0. Short definition: it represents the lowest level of indentation at which a part is used in parent Revision / BOM structures.
    Do a BAQ export on the Part table and include this LowLevelCode field. If any value looks incorrect, in the 100's or 1,000's, then this 5100 conversion will definitely have to be run to try to correct these values.
    An early clue here may be when MRP runs and its log file says Building PartList Level over and over, followed by numbers. These numbers represent the Low Level Code (BOM level indentation) it is currently processing; if the numbers here look abnormally high and unrealistic, it could indicate LowLevelCode problems that need to be corrected by this 5100 conversion. A manual fix program may also be required from Support, to be installed and run if the 5100 conversion does not fix the problem low level codes.

  16. The Revision / BOM for a parent part calls itself out as a material or subassembly of itself. This may also cause its Low Level Code field to be inflated far above its normal amount. When looking at a parent revision, either in the Engineering Workbench or Method Tracker, look through its Material list. If this same parent part is also the part number in a Material sequence Detail screen, delete this sequence or change the part ID itself.
    If there are subassemblies, this parent also cannot be the part ID in any Subassembly Detail screen, nor can it be the part in any material sequence below any subassembly.
    In the Engineering Workbench, when you try to add the parent below itself in either of the above scenarios, you will be stopped by the message "Circular Reference", either when trying to save the material or when Approving the Revision and attempting the Check In back to Part entry. So if the above problem exists, the data must have been imported into the database from outside sources and not created manually in the Engineering Workbench.
    To check whether the above problem exists and is causing MRP problems, enter a log file on the Process MRP screen, then look at the related MRP basic log file that is created while MRP runs. Shortly after it starts running it will go into a section where every entry is titled Building PartList Level: xx, where the xx represents the Low Level Code assigned to a part that is processed. Usually this will be anywhere from 0 up to 5 or 10, maybe 20 or higher, depending on the lowest subassembly level that a part is used at.
    If this number starts becoming inflated and is abnormally high for all these Building PartList entries, such as in the hundreds or thousands, this indicates that MRP has encountered a part or some parts with the above Circular problem and is now hung up and looping on them.
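One rough way to hunt for the circular references described in item 16 outside of Epicor is to export parent/component pairs (for example with a BAQ) and walk the structure looking for any part that can reach itself. The sketch below assumes the export has already been loaded into a list of (parent, component) tuples. The part numbers and the function name are made up for illustration, and this is not anything Epicor ships.

```python
from collections import defaultdict


def find_circular_parts(bom_pairs):
    """Return the set of parts that can reach themselves through the BOM.

    bom_pairs: iterable of (parent_part, component_part) tuples, e.g. rows
    from a hypothetical export of material and subassembly sequences.
    """
    children = defaultdict(set)
    for parent, component in bom_pairs:
        children[parent].add(component)

    circular = set()
    for start in list(children):
        # Iterative depth-first walk from each parent part.
        stack = list(children[start])
        seen = set()
        while stack:
            part = stack.pop()
            if part == start:
                circular.add(start)
                break
            if part in seen:
                continue
            seen.add(part)
            stack.extend(children.get(part, ()))
    return circular


# Made-up data: A and B call each other out, E calls itself out, C is clean.
pairs = [("A", "B"), ("B", "A"), ("C", "D"), ("E", "E")]
print(sorted(find_circular_parts(pairs)))
```

Any part this reports would be a candidate to check out to the Engineering Workbench and correct by hand.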

RESOLUTION:
Do an export from the Part table selecting just a few fields such as PartNum, PartDescription, LowLevelCode. Find the parent parts that have an inflated, abnormal code. Then go to Part entry, check out their revision to the Engineering Workbench, and correct the above problem with the material sequence or subassembly sequence of the same part ID. Follow up by running the 5100 database conversion per step 15 above to properly synch up these codes, which may also be out of synch at this point.
If for some reason the above Circular problem does not exist but inflated Low Level Codes still exist and cause the looping/hanging problem, then the fix program from change request Page 37099mps may need to be sent out to fix the problem codes. If this fix does not work, then a copy of the database will need to be submitted to Epicor support and programming for further review.
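Once the Part table export exists, scanning it for inflated codes is easy to script. This is a minimal sketch that assumes the export is a CSV with PartNum, PartDescription and LowLevelCode columns. The CSV layout, the sample rows, and the cutoff of 100 are assumptions for illustration; the text above only says that values in the hundreds or thousands look wrong.

```python
import csv
import io


def inflated_parts(csv_text, threshold=100):
    """Return (PartNum, LowLevelCode) pairs whose code is at or above threshold."""
    reader = csv.DictReader(io.StringIO(csv_text))
    flagged = []
    for row in reader:
        code = int(row["LowLevelCode"])
        if code >= threshold:
            flagged.append((row["PartNum"], code))
    # Worst offenders first.
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Made-up export rows, not real data.
sample = """PartNum,PartDescription,LowLevelCode
BRKT-100,Bracket,2
ASM-200,Frame assembly,0
PIN-300,Dowel pin,1450
HSG-400,Housing,312
"""
for part, code in inflated_parts(sample):
    print(part, code)
```

For a real export, read the file contents into `csv_text` and adjust `threshold` to whatever depth is realistic for your product structures.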

Part table > LowLevelCode field definition (from the Data Dictionary in System Management):
Internally assigned integer which indicates the deepest level of assembly indentation that this part is used at. This is used by the Cost Rollup routines to control the order in which parts get costed. Parts at the bottom of the product structure (highest level code) are calculated first, and processing continues up the chain, with the final assembly parts being processed last. This ensures that when retrieving the cost of an assembly's components, the components will already have had their cost rolled up.
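To make the definition concrete, here is a small sketch of how a "deepest level used at" number can be derived from parent/component pairs, and why a circular reference makes it blow up. It only illustrates the definition under stated assumptions: top level parts get code 0, and the part numbers and function name are made up. It is not Epicor's actual routine.

```python
from collections import defaultdict


def low_level_codes(bom_pairs):
    """Compute the deepest BOM level each part appears at (top level = 0).

    Raises ValueError if the structure contains a circular reference,
    because the depth would then keep growing without bound.
    """
    children = defaultdict(set)
    parts = set()
    for parent, component in bom_pairs:
        children[parent].add(component)
        parts.update((parent, component))

    codes = {part: 0 for part in parts}
    # Push depths down repeatedly. An acyclic structure with n parts settles
    # within n passes, so a change on the last allowed pass means a cycle.
    for _ in range(len(parts) + 1):
        changed = False
        for parent in list(children):
            for component in children[parent]:
                if codes[component] < codes[parent] + 1:
                    codes[component] = codes[parent] + 1
                    changed = True
        if not changed:
            return codes
    raise ValueError("circular reference in BOM; low level codes would inflate")


# PIN is used directly under TOP and also under SUB, so its code is 2.
print(low_level_codes([("TOP", "SUB"), ("SUB", "PIN"), ("TOP", "PIN")]))
```

With a part that calls itself out, the depth never settles, which matches the inflated codes and endless Building PartList Level entries described above.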

  1. If there are multiple scheduling resources in a Job Operation, as viewed in Job Entry in the Job Details tab > Operations tab > Scheduling Resources tab > List tab, the related Production Calendars for these Resources / Resource Groups must have some checked availability hours that overlap. If not, this will cause the Scheduling engine program to hang.
    For example, say you have resrc1 and resrc2 listed in this List tab that need to be scheduled for the operation. Go to Resource Group entry and find their applicable calendars. Then go to Production Calendar maintenance and select these calendars to see what hours are checked.
    If one calendar only has hours 1 thru 8 checked while the other calendar only has hours 10 thru 18 checked, the Scheduling program sees this and will lock up and hang because it needs to schedule these resources at the same time, concurrently. It cannot schedule one resource for hours 1 thru 8 (midnight to 8am) while it also tries to schedule the other resource at the same time for hours 10 thru 18 (10am to 6pm).
    At least one hour would have to overlap here, so a change needs to be made. Perhaps the hours 1 thru 8 need to be changed to 4 thru 12 so there is some overlap with hours 10 thru 12.
    If all appears to check out OK in the Production Calendar screens, then go back to Resource Group entry and look at the related groups and resources. Perhaps changes have been made to the default available checked hours in either the Calendar Exceptions tab or the Resources tab > Calendars tab.

  2. Run the 6430 database conversion, Recalculate Part Onhand/Allocation summaries. It may help resynch underlying tables and fields for part data that is out of synch and corrupt. Problems with part data could eventually lead to MRP and Job Scheduling problems.

  3. Also run the 6920 conversion; Delete PartDtl with Invalid Job References. This should delete bad records in the PartDtl table from older jobs that should no longer be processed as part of the current MRP runs. This conversion may not be available in version 8.0x. The PartDtl table holds current Demand and Supply records to be processed by MRP. An export-query-BAQ should also be done from this table to view these possible bad records.

  4. In all above scenarios, to properly troubleshoot the issue we would need Appserver logs and MRP/Scheduling logs to assist further.

  5. Add the companies to the Task Agent user IDs.
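The two calendar rules in this write-up (no gap inside a working day, per item 6, and overlapping hours for resources scheduled together, per the calendar item in this last list) can be sanity-checked on paper or with a few lines of script. The sketch below assumes each calendar day has been written down by hand as a set of checked hour numbers, using the same "hours 1 thru 8" style as the text. It handles a single day only and does not deal with shifts that wrap past midnight. The function names are made up.

```python
def gap_hours(checked_hours):
    """Return unchecked hours that sit between the first and last checked hour."""
    if not checked_hours:
        return []
    first, last = min(checked_hours), max(checked_hours)
    return [h for h in range(first, last + 1) if h not in checked_hours]


def shared_hours(calendar_a, calendar_b):
    """Return the hours checked on both calendars, in order."""
    return sorted(set(calendar_a) & set(calendar_b))


# Item 6 style calendar: four hours, a one hour lunch gap, four more hours.
split_shift = set(range(8, 12)) | set(range(13, 17))
print(gap_hours(split_shift))

# Example from the text: hours 1 thru 8 versus hours 10 thru 18.
resrc1 = set(range(1, 9))
resrc2 = set(range(10, 19))
print(shared_hours(resrc1, resrc2))

# Suggested fix from the text: move resrc1 to hours 4 thru 12.
print(shared_hours(set(range(4, 13)), resrc2))
```

A non-empty result from `gap_hours`, or an empty result from `shared_hours` for two resources on the same operation, would be worth a look in Production Calendar maintenance.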

Page 16452esc Version 10.xxx Database conversions

PROBLEM DESCRIPTION:
Users need to run a database conversion that Support suggested to resolve an issue.
Where are these database conversions located in Epicor ERP? 10 10.0 10.0.700

VERSION: 10.0+

PROBLEM RESOLUTION:

  1. From the main menu go to System Management > Upgrade/Mass Regeneration > Conversion Workbench
  2. In the left side tree view select User Run Conversion
  3. Search for the conversion you need to run on the right and double click on it in the ID or Description column.
  4. This opens a new popup box titled Data Conversion Maintenance.
  5. From the Actions dropdown menu select 'Submit' to apply and run the conversion
  6. This Popup box has a section at the bottom titled Run Log, it will have data about the process being run.
  7. Check back in the list of conversions, for the one running now, the ProgStatus column will be Started at the start and change to Complete when it is done.
  8. Close the Popup box when the conversion is complete.
  9. Exit this screen or double click open a different conversion to run if needed.

All I can say is WOW :frowning:

Yeah, I'm still digesting this... a lot of things that can break finite scheduling... Or maybe support grabbed their book and were like, 'maybe your problem is in here'... crazy.
I know a lot more now... I think. :exploding_head:


I'm having a hard time just reading through that! I printed it and will try to digest it later... I think...

Multi-Job Scheduling Bug: If you leave RelatedOp as 0 it breaks scheduling despite the field help saying you can leave it at 0 sigh.


What do you mean by break?
It may just assume the first operation or job start date if related operation = 0.
Is that the result you're seeing?

Patrick Winter

Yes, if I schedule a job normally... But when using Epicor's new Multi-Job, where you demand child jobs (Job to Job)... then visually the 0 RelatedOp does as it should... but when you run Global Scheduling / MRP it will defunct/crash.

I still cannot run Finite Global Scheduling, but going back to that giant list, I tested item 8 on a job that would hang and made sure to have only one of Capability, Resource Group, or Resource. When I changed this I re-ran Global Scheduling and Epicor scheduled my job without an issue. It then went on to hang on the next sequential job. When I attempted a DMT to edit the Capability, Resource Group, or Resource on all the jobs, the DMT did not stick and the information stayed the same.
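For anyone wanting to see how many operations are affected before fixing them, a rough sketch of the item 8 check against exported rows is below. It assumes the scheduling resource rows have been pulled out of Epicor some other way and loaded as dictionaries. The field names (JobNum, OprSeq, CapabilityID, ResourceGrpID, ResourceID) and the sample values are assumptions, so swap in whatever your export actually uses. It only reports the problem rows; it does not update anything.

```python
def over_specified(rows, fields=("CapabilityID", "ResourceGrpID", "ResourceID")):
    """Return (JobNum, OprSeq, filled_fields) for rows with more than one field filled in."""
    flagged = []
    for row in rows:
        # Treat missing, None, and whitespace-only values as empty.
        filled = [f for f in fields if (row.get(f) or "").strip()]
        if len(filled) > 1:
            flagged.append((row["JobNum"], row["OprSeq"], filled))
    return flagged


# Made-up rows: the first has both a Resource Group and a Resource.
rows = [
    {"JobNum": "J1001", "OprSeq": 10, "CapabilityID": "", "ResourceGrpID": "MILL", "ResourceID": "MILL-01"},
    {"JobNum": "J1001", "OprSeq": 20, "CapabilityID": "", "ResourceGrpID": "WELD", "ResourceID": ""},
    {"JobNum": "J1002", "OprSeq": 10, "CapabilityID": "GRIND", "ResourceGrpID": None, "ResourceID": " "},
]
for job, op, filled in over_specified(rows):
    print(job, op, filled)
```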

I am attempting to create an updateable BAQ to make the changes, but I'm having a tough time building one, so this may take some time. If anyone can help me with the BAQ, I'd greatly appreciate it.

Rick;
I know, old thread, lol
We are having the same issue in E9.05.702 and I am testing in Pilot. We are in the process of moving to E10, so this cleanup in our E9 system is important.

The list that Joe provided from Support was excellent, and we were in violation of many of those things, now fixed.
One last question... We use MRP and thus have a bunch of MRP Jobs at any given time that may not be Firmed, and are therefore bogus jobs. Since Epicor schedules these jobs, I thought it might be helpful to delete all MRP Jobs prior to the Global Re-Schedule. Does deleting a job remove it from the schedule, or do I even need to worry about deleting them or how they currently appear in the schedule?
I am just trying to eliminate anything that would affect the Global Re-Schedule, and those seem to be just useless bulk that I don't need.

Any advice appreciated...
George Hicks


I was able to fix all the things in the list above from Epicor Support so that was very helpful.
I ran the Global Reschedule and, although I thought it was hung up, it actually took three days to finish. This was in the PILOT system, so I know it is slower, but that performance is just horrible! It is not a real option to Global Reschedule daily.
Hmmmm. Still in the same boat...

Got more details? Are you using Min/Max? Perhaps post the error logs. 3 days... how many jobs are you scheduling? :o

1,300 Open Jobs
1,257 MRP Jobs that get overwritten every night by MRP but cannot be excluded from the Global Re-Schedule.
I tried setting their priority to last in the Global Scheduling Order screen before the Re-Schedule.
I am using Finite Backward Scheduling with "Allow dates before today" selected in the Company/Plant. I am sure a Forward schedule would be faster, as it goes from today out, but that does not reflect reality.

I would love an easy way to remove all MRP Jobs from the schedule and then just schedule the real jobs, but that just makes too much sense, lol.


How do you delete an Active Task in System Monitor? I was trying to run Global Scheduling, but Epicor got hung up during the Calculate Scheduling Order process.

Did you ever get a response?